The easiest way to sound unprepared in an exam is to answer an AI question like it is a technology strategy discussion. It is not. The examiner is not there to decide whether large language models are impressive. They are trying to decide whether a system that touches spreading, document analysis, risk flagging, memo support, or any other part of the credit process is governed tightly enough that the bank still owns the decision.
That is why the questions are so consistent. They are all control questions. Can you inventory the model. Can you show what happens when it is wrong. Can you reconstruct the lifecycle of a real file. Can you prove the rules are applied consistently. Can you explain what changed when the model changed.
The full regulatory version of this sits in our guide on examiner readiness for AI lending and the broader AI-Assisted Underwriting Playbook. This post is the practical field version. If an examiner asks these five questions, here is what they are really testing, what weak answers sound like, and what good answers look like in a real lending shop.
Scannable summary
What each question is really testing
| Examiner question | What it really means |
|---|---|
| Show me the model inventory | Do you actually know what is running in the credit process, who owns it, and what authority it has? |
| What happens when the AI is wrong? | Did you build for reality, or only for the demo? |
| Can you reconstruct a file right now? | Is the audit trail live and usable, or fragmented across systems? |
| Are thresholds applied consistently? | Is policy enforced centrally, or is judgment drifting analyst by analyst? |
| When did the model change, and how did you validate it? | Do you control change management, or are you outsourcing accountability to the vendor? |
1. "Show me the model inventory."
This sounds administrative. It is not. The examiner is starting with the most basic test of control: does the institution know what systems are in scope. A lot of banks fail this because their "inventory" is really a vendor list. It says the bank uses Vendor X for underwriting automation and stops there. That tells an examiner almost nothing.
A real model inventory breaks the workflow apart. Document classification is one function. Tax return extraction is another. Ratio calculation is another. Policy-based risk flagging is another. Memo support is another. Each one should have a business purpose, owner, production version, inputs, outputs, known limitations, and explicit decision authority. If a tool can surface a DSCR, trace K-1 ownership, or assemble exception support that materially shapes the recommendation, it belongs in the inventory.
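To make that concrete, here is a minimal sketch of what one inventory entry might capture, written as a Python data structure. The field names and example values are illustrative assumptions, not a prescribed schema; the point is that each AI-supported function gets its own entry instead of hiding behind a single vendor line item.

```python
from dataclasses import dataclass


@dataclass
class ModelInventoryEntry:
    """One entry per AI-supported function, not one entry per vendor contract."""
    function: str             # e.g. "Tax return extraction"
    business_purpose: str
    internal_owner: str
    production_version: str
    inputs: list[str]
    outputs: list[str]
    known_limitations: list[str]
    decision_authority: str   # the point where human review is mandatory


# Hypothetical entry, for illustration only
tax_extraction = ModelInventoryEntry(
    function="Tax return extraction",
    business_purpose="Pull line items from 1040/1065/1120-S packages into the spread",
    internal_owner="Credit Administration",
    production_version="4.3",
    inputs=["scanned tax returns", "continuation statements"],
    outputs=["extracted line items with source-page references"],
    known_limitations=["low-quality scans", "amended returns mixed with originals"],
    decision_authority="Analyst must confirm every extracted value before it feeds DSCR",
)
```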
Weak answer
"We use an AI underwriting platform. Compliance has the contract and security packet."
Good answer
"Here is the inventory entry for each AI-supported function in underwriting, including the current version, the internal owner, the use limits, and the point where human review is mandatory."
In the real world, this matters because banks almost always adopt AI one painful workflow at a time. They start with something like financial spreading, then layer in risk flags, then memo support. If your inventory cannot keep up with that creep, your governance is already outdated.
2. "Walk me through what happens when the AI is wrong."
This is my favorite examiner question because it cuts through bullshit fast. Every vendor claims high accuracy. Every bank says humans stay in the loop. Fine. Show the bad case.
In commercial lending, the bad case is never abstract. It is a scanned 1065 where officer compensation is split across a continuation statement. It is a borrower package with amended returns and originals both included. It is a CRE file where a rent roll, trailing twelve, and tax return all describe revenue differently. Good institutions do not act embarrassed when these happen. They built the workflow assuming they would happen.
A strong answer shows the original AI value, the human correction, the reason for the override, the user who made it, and the timestamp. Better still, it shows whether that miss rolled into monitoring. If five analysts corrected the same line item across the same document type in the last month, that is not just a one-off file issue. That is signal for validation and training.
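As a sketch of what that record might look like, the structure below keeps the original AI value alongside the correction and makes the repeated-miss signal easy to surface. The field names and the five-corrections threshold are assumptions for illustration, not a standard.

```python
from collections import Counter
from datetime import datetime

# Hypothetical override records; in practice these come from the workflow system
overrides = [
    {
        "file_id": "LN-2024-0192",
        "document_type": "1065 continuation statement",
        "line_item": "officer compensation",
        "ai_value": 182_000,          # original value stays in the record
        "corrected_value": 241_000,
        "reason": "Compensation split across continuation schedule",
        "user": "analyst_7",
        "timestamp": datetime(2024, 3, 12, 14, 5),
    },
    # ... more records ...
]

# Roll the misses into monitoring: repeated corrections on the same document
# type and line item are a validation signal, not a one-off file issue.
signal = Counter((o["document_type"], o["line_item"]) for o in overrides)
for (doc_type, line_item), count in signal.items():
    if count >= 5:  # illustrative threshold
        print(f"Review extraction for {line_item} on {doc_type}: {count} corrections this month")
```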
Weak answer
"Our underwriters would catch it."
Good answer
"Here is a real file where the model misread a continuation schedule. The underwriter corrected it, the original stayed in the record, and the case fed our monthly override review for that document type."
What good looks like: the override is treated as evidence of healthy governance, not as proof the whole system failed. If the product hides the original value after a human fix, it is making the file less examiner-ready, not more.
3. "If I pick a random loan file, can you reconstruct the full decision lifecycle right now?"
This is the audit trail test. And no, a generic statement like "we log everything" does not pass it. Examiners want to see whether one completed file can be walked from first upload to final approval without a scavenger hunt.
This question gets painful when the workflow is split across the LOS, a vendor portal, shared drives, email chains, and spreadsheet trackers. The loan officer requested missing documents in one system. The analyst corrected a spread in another. The memo lives in Word. The approval chain sits in email or committee minutes. Technically, the information exists. Practically, nobody can reconstruct it live while an examiner is watching.
A good answer does not require a report request to IT. Pick a representative commercial file, not the cleanest demo file in the building, and show document receipt, extraction, overrides, exception handling, memo authorship, and approval timestamps in sequence. If you have to narrate missing pieces from memory, the control is weaker than you think.
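Here is a minimal sketch of what "reconstruct it live" could look like, assuming the workflow events are already captured in one place with timestamps. The event names and details are illustrative, not a product schema.

```python
from datetime import datetime

# Hypothetical consolidated event log for one loan file
events = [
    {"at": datetime(2024, 2, 1, 9, 30),  "step": "document_received", "detail": "2023 1065 package uploaded"},
    {"at": datetime(2024, 2, 1, 9, 42),  "step": "extraction",        "detail": "Spread values extracted, model v4.3"},
    {"at": datetime(2024, 2, 2, 11, 5),  "step": "override",          "detail": "Officer comp corrected by analyst_7"},
    {"at": datetime(2024, 2, 3, 15, 20), "step": "exception",         "detail": "DSCR flag dismissed with documented reason"},
    {"at": datetime(2024, 2, 5, 10, 0),  "step": "memo",              "detail": "Memo drafted, analyst as author of record"},
    {"at": datetime(2024, 2, 7, 16, 45), "step": "approval",          "detail": "Senior lender approval recorded"},
]

# Walk the file from first upload to final approval, in order, on demand
for e in sorted(events, key=lambda e: e["at"]):
    print(f"{e['at']:%Y-%m-%d %H:%M}  {e['step']:<18}  {e['detail']}")
```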
Weak answer
"The information is all there. It just lives in different places."
Good answer
"Pick any completed file. We can show the source docs, extracted values, every override, every dismissed flag with reason, memo support, and the final approval chain from one live workflow."
If you want the deeper governance framework behind this, the examiner readiness guide breaks down the decision authority matrix and the evidence packet that makes this possible.
4. "Are the same thresholds applied to every application?"
This is the question where model risk and fair lending start holding hands. The examiner is probing for hidden inconsistency. They want to know whether policy thresholds live in a controlled library or inside the heads of individual analysts.
In practice, this is where manual shops drift the most. One analyst flags DSCR under 1.25x aggressively. Another is comfortable at 1.20x if guarantor liquidity is strong. One analyst dismisses revenue concentration because the borrower has been in the market for 20 years. Another writes it up every time. Some of that is legitimate judgment. Some of it is inconsistency disguised as experience.
A good AI-assisted workflow does not eliminate judgment. It standardizes the triggers. The thresholds, policy version, exception routing, and dismissal requirements should be centrally administered. If the bank changes a covenant alert threshold or a minimum liquidity rule, there should be an approval path and an effective date. Analysts can still escalate nuance. They should not be quietly editing the rules deal by deal.
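One way to picture a centrally administered threshold library is a versioned config that analysts consume but cannot edit deal by deal. The values and structure below are assumptions for illustration; the substance is that the policy version, the approval, and the effective date travel with every rule.

```python
# Hypothetical policy threshold library: centrally owned, versioned, approved,
# dated. Analysts consume it; they do not edit it per deal.
THRESHOLD_LIBRARY = {
    "policy_version": "2024.1",
    "approved_by": "Credit Policy Committee",
    "effective_date": "2024-03-01",
    "rules": {
        "min_dscr": {"value": 1.25, "exception_route": "senior_credit_officer"},
        "min_guarantor_liquidity": {"value": 250_000, "exception_route": "senior_credit_officer"},
        "revenue_concentration_flag": {"value": 0.40, "exception_route": "credit_committee"},
    },
}


def evaluate_dscr(dscr: float, library: dict = THRESHOLD_LIBRARY) -> dict:
    """Apply the same trigger to every application; escalation, not silent edits."""
    rule = library["rules"]["min_dscr"]
    return {
        "flagged": dscr < rule["value"],
        "policy_version": library["policy_version"],
        "exception_route": rule["exception_route"],
    }


print(evaluate_dscr(1.20))  # flagged under policy 2024.1, routed for exception approval
```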
Weak answer
"The analysts know the standards, and managers review anything unusual."
Good answer
"Here is the centrally managed threshold library tied to policy. Here is who can change it, how changes are approved, and how we monitor override and dismissal patterns by user and product."
This is one reason many banks start with document extraction and spreading before anything more judgment-heavy. The workflow is easier to govern cleanly. Our playbook lays out that sequencing, and it is still the right move.
5. "When did the model version change, and what validation did you do?"
This is where a lot of "continuous improvement" stories fall apart. Vendors love saying the model gets better every week. Fine. An examiner hears that and immediately thinks: then how do you know what logic was in production on the day this file was underwritten?
A real answer has dates, versions, and evidence. The bank should know what changed, why it changed, what was tested before release, which edge cases were part of the test set, who signed off internally, and what heightened monitoring followed. If your institution keeps a golden dataset of real-world files, including ugly multi-entity tax packages and scanned statements that have burned analysts before, re-running that set before production changes is table stakes.
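As a rough sketch, re-running a golden dataset before release can be as simple as comparing candidate-model output against known-correct values for each file and refusing to promote if anything that used to pass now fails. The function names and comparison logic here are illustrative assumptions, not a vendor API.

```python
# Hypothetical golden-dataset regression check, run before promoting a new model version.
# Each case holds a file reference and the known-correct extracted values.


def regression_report(golden_set: list[dict], extract_fn) -> dict:
    """Compare candidate-model output against expected values for every golden file."""
    results = {"passed": [], "regressed": []}
    for case in golden_set:
        actual = extract_fn(case["file_path"])
        if actual == case["expected_values"]:
            results["passed"].append(case["file_id"])
        else:
            results["regressed"].append(case["file_id"])
    return results


# Usage sketch: block the release if any previously correct file regresses
# report = regression_report(golden_set, candidate_model.extract)
# assert not report["regressed"], f"Hold release: regressions on {report['regressed']}"
```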
This is exactly where SR 11-7 and OCC Bulletin 2025-26 meet operational reality. Community banks do not need a giant annual validation ritual for a narrow analyst-assistance tool. They do need a risk-based process that proves changes are controlled and documented.
Weak answer
"The vendor improves the model continuously, and we have not seen any major issues."
Good answer
"Version 4.3 went live on March 18. We re-ran our golden dataset, documented no regression on clean 1040s and improved performance on scanned 1120-S schedules, then approved release with two weeks of heightened override monitoring."
Hard truth: if the vendor can change production behavior and you cannot tell an examiner exactly when it changed, you do not control the model. You rent it.
The practical prep move most teams skip
Keep one representative completed loan bookmarked as your live demonstration file. Not the easiest file. Not the prettiest file. A real commercial file with enough complexity to prove the controls are real. If your team can walk that file cleanly in under five minutes, most of the hard work is already done.
That one habit forces discipline upstream. Your inventory has to be current. Your override flow has to preserve history. Your lifecycle logging has to be usable. Your threshold management has to be centralized. Your version history has to exist. In other words, the demonstration file exposes whether your governance is operational or just documented.
If you are still early, start there. Then work backward into the packet. Inventory. Validation memo. Decision authority matrix. Change log. Monitoring summary. That is a much better use of time than polishing a slide deck nobody will trust once the examiner starts clicking around.
FAQ: examiner questions about AI in lending
What do bank examiners ask about AI in underwriting?
The questions are usually operational, not theoretical. Examiners ask to see the model inventory, the override workflow when AI is wrong, the full decision lifecycle on a real loan file, whether the same thresholds are applied consistently across applications, and how model updates are validated and approved.
Do examiners expect community banks to have a full enterprise AI governance program?
No. They expect a program proportionate to the risk. OCC Bulletin 2025-26 makes that clear. A community bank using AI for analyst assistance still needs a model inventory, validation, monitoring, and human decision authority, but not a money-center governance stack built for high-risk autonomous decisioning.
What is a bad answer when an examiner asks about AI controls?
Anything that depends on improvisation. "The vendor handles that," "our underwriters would catch it," or "we log everything somewhere" are all weak answers. Good answers show an actual document, a real workflow, or a live audit trail the examiner can review immediately.
How should banks validate AI model updates in lending workflows?
Banks should re-run a maintained golden dataset, document what changed, measure whether performance improved or regressed by document type, record approval from the internal owner, and monitor post-release overrides. Continuous vendor improvement is not a substitute for bank-side change management.
Going deeper? Read the full examiner readiness guide, the broader AI-Assisted Underwriting Playbook, or see how governed workflows apply in practice for SBA underwriting and commercial lending operations.