When regulators recently accused AI-based lending software companies of creating “black boxes,” the fintechs pushed back. Fintechs and consumer advocates say the classic FICO credit score and banks’ traditional loan underwriting programs based on it are not transparent and keep already underserved people shut out of mainstream credit.
Will Lansing, FICO's chief executive, says his company is evolving the FICO score by making use of some of the same alternative data fintechs use.
“All the lenders and all the fintechs and FICO share the same desire, which is to get as much credit into responsible hands as we possibly can,” Lansing said. “So wherever there's an underrepresented population, a group that we're not able to evaluate, we are all on the hunt to figure out how to get credit to these people.”
Consumer advocates and fintechs say a closer look needs to be taken at traditional credit scores and models, perhaps using a recently released fairness framework.
The 'black box' critique
Many fintech lenders and others say the FICO score is assembled in an opaque manner that could hide all kinds of bias.
“The biggest black box out there is FICO,” said Teddy Flo, chief legal officer at Zest AI, an AI lending software provider. “What FICO forgets to say is it uses a form of machine learning. And they will not tell you what features are in that model. They don't provide fair-lending reports on their model.”
For consumers, “generally folks feel that the FICO score is not as transparent as they would like, in terms of understanding how decisions are being made around their credit or individual circumstance,” said Ulysses Smith, head of diversity, inclusion and belonging at the mortgage software company Blend.
Smith, who is going through a mortgage loan process himself (he’s not using Blend software), said the process is still painful for consumers.
“A lot of information is presented to the consumer in a way that is not necessarily accessible,” he said.
The five criteria of the FICO score are spelled out: 35% is based on payment history (people are penalized for late or missed payments); 30% is outstanding debt (people are hurt by having too much debt); 15% is credit age (longevity and consistency are valued); 10% is account types (diverse types of credit are good); and 10% is new activity (a flurry of new accounts can be a red flag).
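FICO's actual formula is proprietary, but the published category weights suggest a simple mental model: a weighted blend of per-category subscores. The sketch below is illustrative only; the subscore scaling and the mapping onto the familiar 300-850 range are assumptions, not FICO's method.

```python
# Illustrative only: FICO's actual model is proprietary. This sketch shows
# how the published category weights could combine normalized subscores
# (each scaled 0.0-1.0) into a score on the familiar 300-850 range.
CATEGORY_WEIGHTS = {
    "payment_history": 0.35,
    "outstanding_debt": 0.30,
    "credit_age": 0.15,
    "account_mix": 0.10,
    "new_activity": 0.10,
}

def illustrative_score(subscores: dict[str, float]) -> int:
    """Blend per-category subscores into a single 300-850 number."""
    weighted = sum(CATEGORY_WEIGHTS[k] * subscores[k] for k in CATEGORY_WEIGHTS)
    return round(300 + weighted * 550)  # map 0..1 onto 300..850

# A borrower strong on payment history but carrying heavy debt:
print(illustrative_score({
    "payment_history": 0.95,
    "outstanding_debt": 0.40,
    "credit_age": 0.70,
    "account_mix": 0.60,
    "new_activity": 0.80,
}))  # 684
```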
Lansing said the score his company developed is “a very transparent model.”
“The FICO score is 100% explained,” he said. “We say this is the complete and comprehensive set of variables that are going into the decision. And these are the weights that we put on them. And this is the decision that came out of it. That's how we're able to, with confidence, provide reason codes when lenders turn down consumers or give them a different size credit line than they asked for.”
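That explainability claim maps onto how adverse-action reason codes are commonly produced from a transparent weighted model: rank the variables by how much each one pulled the score down. A minimal, hypothetical sketch of the technique (not FICO's actual methodology), reusing the illustrative weights above:

```python
# Hypothetical sketch: derive reason codes from a transparent weighted model
# by ranking each variable's drag on the score. Not FICO's actual method.
WEIGHTS = {"payment_history": 0.35, "outstanding_debt": 0.30,
           "credit_age": 0.15, "account_mix": 0.10, "new_activity": 0.10}

REASON_TEXT = {
    "payment_history": "Late or missed payments",
    "outstanding_debt": "Amount owed on accounts is too high",
    "credit_age": "Length of credit history is too short",
    "account_mix": "Limited variety of credit accounts",
    "new_activity": "Too many recent credit inquiries",
}

def reason_codes(subscores: dict[str, float], top_n: int = 2) -> list[str]:
    # Shortfall = weight * distance from a perfect subscore of 1.0
    shortfalls = {k: WEIGHTS[k] * (1.0 - subscores[k]) for k in WEIGHTS}
    worst = sorted(shortfalls, key=shortfalls.get, reverse=True)[:top_n]
    return [REASON_TEXT[k] for k in worst]

print(reason_codes({"payment_history": 0.95, "outstanding_debt": 0.40,
                    "credit_age": 0.70, "account_mix": 0.60,
                    "new_activity": 0.80}))
# ['Amount owed on accounts is too high', 'Length of credit history is too short']
```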
The Catch-22 of credit scores
Another critique of the FICO score and bank loan decision models that rely heavily on it is that they can perpetuate bias because they put a lot of weight on credit history. The logic is, the best way to predict whether someone is going to pay back a loan is to look at whether they've paid back credit in the past.
This reasoning, of course, favors people who have taken out credit in the past and paid it back on time.
But for people who have been discriminated against in the past and therefore have been denied credit much of their lives, this creates a vicious cycle in which they are less likely to get a loan today.
Some of this stems from where banks place their branches, consumer advocates say.
“In the United States, we have a bifurcated or a dual credit system in which banks are hyperconcentrated in white communities and payday lenders and check cashers are hyperconcentrated in communities of color,” said Lisa Rice, president and CEO of the National Fair Housing Alliance. “So people are accessing credit based on the providers that are in close proximity to where they are located right now.”
When people who use payday lenders make payments on time, those lenders do not report those payments to the credit bureaus.
“So you get no lift, you get no positive benefit from accessing credit in those areas,” Rice said. “On the flip side, if you go to a payday lender or a check casher, and you don't pay your credit on time, you get turned over to collections. That negative information from the collections agency does get reported to the credit bureau. It's a really perverse construct and one that feeds into the biased outcomes that we see in credit scoring systems.”
Lansing agrees that the FICO score favors those who have received bank credit in the past.
“It's the Catch-22 that says it's hard to evaluate you for credit if you haven't had credit in the past,” he said. “That's a challenge.”
Online lenders like Upstart and Petal augment credit report and FICO score data with other data that indicates responsible behavior, such as records of consumers’ payments of rent, utility, cellphone and cable bills.
Lansing says this is the principle behind UltraFICO, a credit score launched in 2019 by FICO, Experian and the Mastercard-owned data aggregator Finicity. The UltraFICO score considers credit report data but also factors in how well consumers manage their money, by analyzing their bank account activity.
“To capture populations who aren't getting into the credit cycle, we have to go to alternative data and alternative scores,” Lansing said. “The idea behind some of our new or more innovative scores is to find ways of identifying responsible behavior that is likely correlated to good repayment behavior.”
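In spirit, an UltraFICO-style score turns raw, consumer-permissioned bank account activity into features a model can use. Below is a hypothetical sketch of that kind of feature extraction; the field names and feature definitions are invented for illustration, not UltraFICO's actual inputs.

```python
# Hypothetical cash-flow features of the kind a bank-account-based score
# might derive from permissioned transaction data. Invented for illustration.
from dataclasses import dataclass

@dataclass
class Transaction:
    amount: float        # positive = deposit, negative = withdrawal
    balance_after: float # account balance after the transaction posts

def cash_flow_features(history: list[Transaction]) -> dict[str, float]:
    balances = [t.balance_after for t in history]
    deposits = sum(t.amount for t in history if t.amount > 0)
    return {
        "avg_balance": sum(balances) / len(balances),
        "overdraft_events": sum(1 for b in balances if b < 0),
        # Share of inflows retained rather than spent:
        "savings_rate": sum(t.amount for t in history) / max(deposits, 1.0),
    }
```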
UltraFICO is not widely used yet, partly because consumers have to opt in.
The latest versions of the FICO score, which came out at the end of 2020, are FICO 10 and 10T. These rely on credit bureau data, but 10T (the “T” stands for “trended”) includes data on consumers’ payment and debt history for the previous 24 months, offering a closer look at consumers’ recent behavior.
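The value of trended data is directionality: a borrower paying a balance down looks different from one ramping it up, even when both carry the same balance today. A minimal illustration of the idea (not FICO 10T's actual features), using a least-squares slope over 24 monthly balances:

```python
# Illustrative only: "trended" data in the spirit of FICO 10T. A simple
# least-squares slope over 24 monthly revolving balances separates a
# borrower paying debt down from one ramping it up.
def balance_trend(monthly_balances: list[float]) -> float:
    """Least-squares slope in dollars per month (negative = paying down)."""
    n = len(monthly_balances)
    xs = range(n)
    mean_x = sum(xs) / n
    mean_y = sum(monthly_balances) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, monthly_balances))
    var = sum((x - mean_x) ** 2 for x in xs)
    return cov / var

paying_down = [3000 + 100 * (23 - m) for m in range(24)]  # 5,300 down to 3,000
ramping_up  = [3000 - 100 * (23 - m) for m in range(24)]  # 700 up to 3,000
print(balance_trend(paying_down), balance_trend(ramping_up))  # -100.0 100.0
```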
Today, most banks use the FICO 9 score in their underwriting systems, and it may take time for FICO 10 and 10T to become the norm. After FICO 9 came out, it took four years for it to overtake FICO 8 and reach 51% usage among lenders.
“You have to put the score through testing, you have to approve it with the regulators, you have to build it into your systems, you have to watch it,” Lansing said.
‘Blunt instrument’: FICO cutoffs
A related criticism of banks’ reliance on FICO scores in loan decision models is the widespread use of score cutoffs, which lenders typically raise during a cyclical downturn. If a lender normally won’t lend to anyone with a score below 680, during a recession it might raise that cutoff to 700.
“If you look at the bands below those certain FICO scores, they disproportionately contain people of color,” said Laura Kornhauser, CEO and co-founder of Stratyfy, a company whose technology assesses and mitigates bias in algorithms used for purposes like lending; Fairplay.ai’s software also does this. “Unfortunately, Black and Latinx populations disproportionately have lower FICO scores than other racial groups because of systemic inequalities that are baked into our financial system and then baked into the data that FICO uses to create their scores. It's not necessarily how FICO's model is working with the data [that’s the issue]. It's the fact that the actual data is not the source of truth that it should be.”
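One common first-pass check for the kind of disparity Kornhauser describes is the adverse impact ratio, which compares each group's approval rate against the best-treated group; ratios below roughly 0.8 echo the "four-fifths rule" from employment law, often borrowed in fair-lending analysis. This is a generic technique, not Stratyfy's or FairPlay's proprietary method:

```python
# A standard first-pass fairness check: the adverse impact ratio compares
# approval rates across groups; values below ~0.8 are a common red flag.
def adverse_impact_ratio(approvals: dict[str, tuple[int, int]]) -> dict[str, float]:
    """approvals maps group -> (approved, total_applicants)."""
    rates = {g: a / n for g, (a, n) in approvals.items()}
    reference = max(rates.values())  # highest-approval group as the baseline
    return {g: r / reference for g, r in rates.items()}

print(adverse_impact_ratio({"group_a": (700, 1000), "group_b": (450, 1000)}))
# group_a: 1.0, group_b: ~0.64 -> below 0.8, warrants a closer look
```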
Lansing acknowledges that lenders use FICO cutoffs, with the full approval and support of regulators, who are focused on the riskiness of bank portfolios.
“But it’s not a great way to do it because when you go from 680 to 700, there are some good people in there who are getting turned off,” Lansing said. “We wish they weren't, but when you have a blunt instrument, that's what happens.”
FICO data scientists studied the data behind FICO scores to ask: Is it possible that in a recession, some people will perform better than others at a particular score?
What they found is that one person with a 680 FICO score might have credit lines that haven’t been maxed out, but might have a few late payments due to sloppiness. Another person with a 680 might be maxed out, pulling every cent of credit, right up against their limits, but paying all their bills on time.
“In a downturn, which of those two 680s is likely to be able to repay you?” Lansing said. “Sadly, the answer is the first one. So when you're using the ‘blunt instrument’ approach and you just go from 680 to 700, neither of those people are going to get credit.”
The result was the FICO Resilience Index, which rank-orders consumers according to their perceived ability to weather a downturn. For instance, people who have had fewer credit inquiries in the last year, fewer active accounts, lower total revolving balances and more experience managing credit would rank high in the index.
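As a sketch of what rank-ordering on those signals could look like, here is a hypothetical scoring function over the features named above; the weights are invented for illustration and are not FICO's.

```python
# Hypothetical resilience-style rank ordering. Feature names mirror the
# signals described in the article; the weights are invented.
def resilience_rank(applicants: list[dict]) -> list[dict]:
    def score(a: dict) -> float:
        return (
            -2.0 * a["inquiries_last_year"]         # fewer inquiries = more resilient
            - 1.0 * a["active_accounts"]            # fewer open accounts
            - 0.001 * a["total_revolving_balance"]  # lower revolving balances
            + 0.5 * a["years_managing_credit"]      # more experience with credit
        )
    return sorted(applicants, key=score, reverse=True)  # most resilient first

applicants = [
    {"inquiries_last_year": 0, "active_accounts": 3,
     "total_revolving_balance": 1200, "years_managing_credit": 12},
    {"inquiries_last_year": 5, "active_accounts": 9,
     "total_revolving_balance": 8500, "years_managing_credit": 2},
]
print(resilience_rank(applicants))  # first applicant ranks as more resilient
```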
“The lenders like it, because it lets them continue to lend and continue to grow their business, even in a downturn, as opposed to just shrinking and denying credit to everybody,” Lansing said.
Rethinking scores and models
Lansing doesn’t see any need to change the math behind the FICO score. He does see the benefit of using alternative data to make more informed decisions.
“For me, that's really the frontier,” he said. “There's all these reasons why we can now look beyond just a single data set to try to understand who's creditworthy and who's not. We applaud that and we encourage it and we develop scores based on these alternative data sets.”
But there are practical limits, he said.
Some data sets aren’t large enough to apply to a big population, so they don’t help much, Lansing said.
“It's a cost-benefit decision on how much data we're going to bring to this decision,” he said.
Banks are limited in how much they can get creative with alternative data because the industry is so heavily regulated, Smith noted.
“Even if there are data points that we would like to use that we know would drive additional access or provide a clearer picture of somebody's overall financial health or well-being, or ability to repay loans, or even demonstrate income stability, lenders are still subject to rules on whether or not they can use those data points and at what point and when, because we still know that there are opportunities for people to use proxies,” he said.
Blend has begun accepting rent payment data in its underwriting platform, now that Fannie Mae has approved its use. It has also partnered with the payroll provider ADP for borrower income verification. Income verification for gig workers is in the works, possibly drawing on payment providers like Venmo and Cash App.
Fairness framework
The National Fair Housing Alliance recently introduced a PPM framework (“purpose, process and monitoring”) that lenders can use to audit their data-driven models like credit scoring systems and the FICO score itself.
The PPM framework asks model developers to think about the purpose of their model, and identify any risks it may pose to consumers, institutions or society at large.
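As a rough illustration, a lender could capture such a review as a structured record alongside the model itself. The field names in this sketch are illustrative assumptions, not the NFHA's actual specification:

```python
# A minimal sketch of capturing a PPM-style (purpose, process, monitoring)
# review as a structured record. Field names are illustrative, not the
# NFHA's actual specification.
from dataclasses import dataclass, field

@dataclass
class ModelAudit:
    model_name: str
    purpose: str                # what decision the model supports
    consumer_risks: list[str]   # harms identified up front
    process_notes: list[str] = field(default_factory=list)       # variables, weights, data checks
    monitoring_metrics: list[str] = field(default_factory=list)  # ongoing fairness checks
    shelved: bool = False       # True if the risks outweigh available mitigations

audit = ModelAudit(
    model_name="consumer_underwriting_v1",
    purpose="Approve or decline unsecured personal loans",
    consumer_risks=["disparate impact on protected classes",
                    "feedback loop from historical denials"],
)
```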
“It may be that some models may not need to be developed, or should never have been developed because they're just too systemically risky, and so therefore they need to be shelved and not considered until perhaps a later date when we have better ways to mitigate the harms and the risks that those models present,” Rice said.
An example of this is the use of facial recognition in law enforcement, which many cities have dropped.
The PPM framework includes a “staff profile” meant to encourage diversity. When the team building a model is diverse and well educated on things like fair-lending laws, fair-housing laws and civil rights laws, “they do a much better job in building technology that is safer and fairer and more accurate,” Rice said.
Model developers should be asking questions about the variables used in a model, the weighting of those variables, and whether they are truly representative of the ultimate consumer data set on which the model will be used, Rice said.
They should also analyze the data they’re using.
“People are realizing more and more that the data that we are using to build these systems is biased. It's tainted,” Rice said. “If you're using biased data, you're going to get biased outputs. So examining that data, making sure it's clean, making sure it's representative, making sure that it's fair and accurate is really important.”
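A concrete version of the representativeness check Rice describes is to compare the group mix in the training data with the population the model will actually score. In this sketch, the group labels and the 10% tolerance are assumptions:

```python
# Illustrative representativeness check: flag groups whose share of the
# training data diverges from their share of the population to be scored.
def representativeness_gaps(train_counts: dict[str, int],
                            population_counts: dict[str, int],
                            tolerance: float = 0.10) -> dict[str, float]:
    train_total = sum(train_counts.values())
    pop_total = sum(population_counts.values())
    gaps = {}
    for group in population_counts:
        train_share = train_counts.get(group, 0) / train_total
        pop_share = population_counts[group] / pop_total
        if abs(train_share - pop_share) > tolerance * pop_share:
            gaps[group] = train_share - pop_share  # negative = underrepresented
    return gaps

print(representativeness_gaps(
    {"group_a": 9000, "group_b": 1000},    # training data
    {"group_a": 7000, "group_b": 3000}))   # population to be scored
# ~{'group_a': 0.2, 'group_b': -0.2} -> group_b badly underrepresented
```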
The PPM framework can help a bank reexamine a loan system that may have been developed 20 years ago and assess whether it was designed correctly.
“If the answer is no, we didn't build it correctly, then you move on to the next step of, OK, how can we compel the system to be fair, or do we need to start building a new system that is more fair?” Rice said.