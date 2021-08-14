How the law got it wrong with Apple Card

Liz O'Sullivan

Advocates of algorithmic justice have begun to see their proverbial “days in court” with legal investigations of enterprises like UHG and Apple Card. The Apple Card case is a strong example of how current anti-discrimination laws fall short of the fast pace of scientific research in the emerging field of quantifiable fairness.

While it may be true that Apple and their underwriters were found innocent of fair lending violations, the ruling came with clear caveats that should be a warning sign to enterprises using machine learning within any regulated space. Unless executives begin to take algorithmic fairness more seriously, their days ahead will be full of legal challenges and reputational damage.

What happened with Apple Card?

In late 2019, startup leader and social media celebrity David Heinemeier Hansson raised an important issue on Twitter, to much fanfare and applause. With almost 50,000 likes and retweets, he asked Apple and their underwriting partner, Goldman Sachs, to explain why he and his wife, who share the same financial ability, would be granted different credit limits. To many in the field of algorithmic fairness, it was a watershed moment to see the issues we advocate go mainstream, culminating in an inquiry from the NY Department of Financial Services (DFS).

At first glance, it may seem heartening to credit underwriters that the DFS concluded in March that Goldman’s underwriting algorithm did not violate the strict rules of financial access created in 1974 to protect women and minorities from lending discrimination. While disappointing to activists, this result was not surprising to those of us working closely with data teams in finance.

There are some algorithmic applications for financial institutions where the risks of experimentation far outweigh any benefit, and credit underwriting is one of them. We could have predicted that Goldman would be found innocent, because the laws for fairness in lending (if outdated) are clear and strictly enforced.

And yet, there is no doubt in my mind that the Goldman/Apple algorithm discriminates, along with every other credit scoring and underwriting algorithm on the market today. Nor do I doubt that these algorithms would fall apart if researchers were ever granted access to the models and data we would need to validate this claim. I know this because the NY DFS partially released its methodology for vetting the Goldman algorithm, and as you might expect, their audit fell far short of the standards held by modern algorithm auditors today.

How did DFS (under current law) assess the fairness of Apple Card?

In order to prove the Apple algorithm was "fair," DFS considered first whether Goldman had used "prohibited characteristics" of potential applicants like gender or marital status. This one was easy for Goldman to pass — they don’t include race, gender or marital status as an input to the model. However, we’ve known for years now that some model features can act as "proxies" for protected classes.

The DFS methodology, based on 50 years of legal precedent, failed to mention whether they considered this question, but we can guess that they did not. Because if they had, they’d have quickly found that credit score is so tightly correlated to race that some states are considering banning its use for casualty insurance. Proxy features have only stepped into the research spotlight recently, giving us our first example of how science has outpaced regulation.
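To make the proxy idea concrete, here is a minimal sketch of one check an auditor might run: measuring how strongly a supposedly neutral feature tracks membership in a protected group. The data is synthetic and invented purely for illustration, the variable names are hypothetical, and nothing here reflects the DFS methodology or Goldman's model. A strong correlation would suggest the feature can stand in for the protected attribute even when that attribute is never given to the model.

```python
from math import sqrt


def pearson(xs, ys):
    """Plain Pearson correlation between two equal-length numeric lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Synthetic, made-up data for illustration only (assumption, not real figures).
credit_scores = [620, 640, 660, 700, 720, 740]
# 1 = applicant belongs to the protected group, 0 = does not.
in_protected_group = [1, 1, 1, 0, 0, 0]

r = pearson(credit_scores, in_protected_group)
print(f"correlation between feature and group membership: {r:.2f}")
```

In practice a single correlation is only a first screen, since combinations of features can act as a proxy even when no single feature does.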

In the absence of protected features, DFS then looked for credit profiles that were similar in content but belonged to people of different protected classes. In a certain imprecise sense, they sought to find out what would happen to the credit decision were we to “flip” the gender on the application. Would a female version of the male applicant receive the same treatment?

Intuitively, this seems like one way to define "fair." And it is — in the field of machine learning fairness, there is a concept called a "flip test" and it is one of many measures of a concept called "individual fairness," which is exactly what it sounds like. I asked Patrick Hall, principal scientist at bnh.ai, a leading boutique AI law firm, about the analysis most common in investigating fair lending cases. Referring to the methods DFS used to audit Apple Card, he called it basic regression, or "a 1970s version of the flip test," bringing us example number two of our insufficient laws.
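A flip test in the sense described above can be sketched in a few lines: score each applicant as submitted, score them again with only the gender field changed, and list the applicants whose decision changes. The model, field names and threshold below are hypothetical stand-ins written for this example, not the Apple Card algorithm or the regression DFS used. The toy model is deliberately biased so the test has something to find.

```python
from typing import Any, Callable, Dict, List

Applicant = Dict[str, Any]


def toy_credit_model(applicant: Applicant) -> bool:
    """Hypothetical model: approve when a made-up score reaches 120.

    It deliberately penalizes gender == "F" so the flip test can detect it.
    """
    score = applicant["income"] / 1000 + applicant["credit_score"] / 10
    if applicant["gender"] == "F":
        score -= 5
    return score >= 120


def flip_gender(applicant: Applicant) -> Applicant:
    """Return a copy of the applicant with only the gender field changed."""
    flipped = dict(applicant)
    flipped["gender"] = "M" if applicant["gender"] == "F" else "F"
    return flipped


def flip_test(model: Callable[[Applicant], bool],
              applicants: List[Applicant]) -> List[int]:
    """Indices of applicants whose decision changes when gender is flipped."""
    return [
        i for i, a in enumerate(applicants)
        if model(a) != model(flip_gender(a))
    ]


# Synthetic applicants, invented for illustration.
applicants = [
    {"income": 50000, "credit_score": 720, "gender": "F"},
    {"income": 80000, "credit_score": 750, "gender": "F"},
    {"income": 30000, "credit_score": 600, "gender": "M"},
]

changed = flip_test(toy_credit_model, applicants)
print("decisions that depend on gender:", changed)
```

Note the limitation: a model that never reads the gender field passes this test trivially, so a clean flip test says nothing about proxy features.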

A new vocabulary for algorithmic fairness

Ever since Solon Barocas’ seminal paper "Big Data’s Disparate Impact" in 2016, researchers have been hard at work translating core philosophical concepts into mathematical terms. Several conferences have sprung into existence, with new fairness tracks emerging at the most notable AI events. The field is in a period of hypergrowth, and the law has so far failed to keep pace. But just as happened in the cybersecurity industry, this legal reprieve won’t last forever.

Perhaps we can forgive DFS for its softball audit given that the laws governing fair lending are born of the civil rights movement and have not evolved much in the 50-plus years since inception. The legal precedents were set long before machine learning fairness research really took off. If DFS had been appropriately equipped to deal with the challenge of evaluating the fairness of the Apple Card, they would have used the robust vocabulary for algorithmic assessment that’s blossomed over the last five years.

The DFS report, for instance, makes no mention of measuring "equalized odds," a notorious line of inquiry first made famous in 2018 by Joy Buolamwini, Timnit Gebru and Deb Raji. Their “Gender Shades” paper proved that facial recognition algorithms guess wrong on dark female faces more often than they do on subjects with lighter skin, and this reasoning holds true for many applications of prediction beyond computer vision alone.

Equalized odds would ask of Apple’s algorithm: Just how often does it predict creditworthiness correctly? How often does it guess wrong? Are there disparities in these error rates among people of different genders, races or disability status? According to Hall, these measurements are important, but simply too new to have been fully codified into the legal system.
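The questions above map onto two error rates per group: the true positive rate (creditworthy applicants the model approves) and the false positive rate (applicants approved who turn out not to be creditworthy). Equalized odds asks that both rates be roughly equal across groups. The sketch below computes them on synthetic labels invented for this example. It assumes the auditor knows each applicant's group and true outcome, which is often not the case in lending.

```python
from collections import defaultdict


def error_rates_by_group(y_true, y_pred, groups):
    """Return {group: {"tpr": ..., "fpr": ...}} from binary labels."""
    counts = defaultdict(lambda: {"tp": 0, "fn": 0, "fp": 0, "tn": 0})
    for truth, pred, group in zip(y_true, y_pred, groups):
        if truth and pred:
            counts[group]["tp"] += 1
        elif truth and not pred:
            counts[group]["fn"] += 1
        elif not truth and pred:
            counts[group]["fp"] += 1
        else:
            counts[group]["tn"] += 1

    rates = {}
    for group, c in counts.items():
        positives = c["tp"] + c["fn"]
        negatives = c["fp"] + c["tn"]
        rates[group] = {
            "tpr": c["tp"] / positives if positives else float("nan"),
            "fpr": c["fp"] / negatives if negatives else float("nan"),
        }
    return rates


def equalized_odds_gaps(rates):
    """Largest between-group difference in TPR and in FPR."""
    tprs = [r["tpr"] for r in rates.values()]
    fprs = [r["fpr"] for r in rates.values()]
    return max(tprs) - min(tprs), max(fprs) - min(fprs)


# Synthetic data: 1 = creditworthy (y_true) or approved (y_pred).
y_true = [1, 1, 1, 1, 0, 0, 1, 1, 1, 1, 0, 0]
y_pred = [1, 1, 1, 1, 0, 0, 1, 1, 0, 0, 0, 1]
groups = ["M"] * 6 + ["F"] * 6

rates = error_rates_by_group(y_true, y_pred, groups)
tpr_gap, fpr_gap = equalized_odds_gaps(rates)
print(rates)
print("TPR gap:", tpr_gap, "FPR gap:", fpr_gap)
```

In this made-up data the model approves every creditworthy man but only half of the creditworthy women, which is the kind of disparity that overall accuracy alone would hide.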

If it turns out that Goldman regularly underestimates female applicants in the real world, or assigns interest rates that are higher than Black applicants truly deserve, it’s easy to see how this would harm these underserved populations at national scale.

Financial services’ Catch-22

Modern auditors know that the methods dictated by legal precedent fail to catch nuances in fairness for intersectional combinations within minority categories — a problem that’s exacerbated by the complexity of machine learning models. If you’re Black, a woman and pregnant, for instance, your likelihood of obtaining credit may be lower than the average of the outcomes among each overarching protected category.

These underrepresented groups may never benefit from a holistic audit of the system without special attention paid to their uniqueness, given that the sample size of minorities is by definition a smaller number in the set. This is why modern auditors prefer "fairness through awareness" approaches that allow us to measure results with explicit knowledge of the demographics of the individuals in each group.
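A small sketch shows why intersectional groups need explicit attention. It computes approval rates first for each protected category on its own, then for combinations of categories, and reports the sample size alongside each rate. The records are synthetic and the field names are hypothetical. The approach assumes the auditor has access to the demographic labels, which is the premise of "fairness through awareness."

```python
from collections import defaultdict


def approval_rates(records, attrs):
    """Return {group: (approval_rate, sample_size)} for the given attributes."""
    totals = defaultdict(lambda: [0, 0])  # group -> [approved, count]
    for rec in records:
        group = tuple(rec[a] for a in attrs)
        totals[group][0] += rec["approved"]
        totals[group][1] += 1
    return {g: (approved / n, n) for g, (approved, n) in totals.items()}


# Synthetic records, invented for illustration only.
records = [
    {"race": "Black", "gender": "F", "approved": 0},
    {"race": "Black", "gender": "F", "approved": 0},
    {"race": "Black", "gender": "M", "approved": 1},
    {"race": "Black", "gender": "M", "approved": 1},
    {"race": "White", "gender": "F", "approved": 1},
    {"race": "White", "gender": "F", "approved": 1},
    {"race": "White", "gender": "M", "approved": 1},
    {"race": "White", "gender": "M", "approved": 0},
]

by_race = approval_rates(records, ["race"])
by_gender = approval_rates(records, ["gender"])
by_both = approval_rates(records, ["race", "gender"])

print("by race:", by_race)
print("by gender:", by_gender)
print("by race and gender:", by_both)
```

Here Black applicants and women are each approved half the time, yet Black women are never approved. The per-category view averages that away, and the small sample size for the intersection shows how easily it goes unnoticed.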

But there’s a Catch-22. In financial services and other highly regulated fields, auditors often can’t use "fairness through awareness," because they may be prevented from collecting sensitive information from the start. The goal of this legal constraint was to prevent lenders from discriminating. In a cruel twist of fate, it now gives cover to algorithmic discrimination, which is our third example of legal insufficiency.

The fact that we can’t collect this information hamstrings our ability to find out how models treat underserved groups. Without it, we might never prove what we know to be true in practice: full-time moms, for instance, will reliably have thinner credit files, because they don’t execute every credit-based purchase under both spousal names. Members of minority groups may be far more likely to be gig workers, tipped employees or participants in cash-based industries, leading to commonalities among their income profiles that are less common for the majority.

Importantly, these differences on the applicants’ credit files do not necessarily translate to true financial responsibility or creditworthiness. If it’s your goal to predict creditworthiness accurately, you’d want to know where the method (e.g., a credit score) breaks down.

What this means for businesses using AI

In Apple’s example, it’s worth mentioning a hopeful epilogue to the story where Apple made a consequential update to their credit policy to combat the discrimination that is protected by our antiquated laws. In Apple CEO Tim Cook’s announcement, he was quick to highlight a "lack of fairness in the way the industry [calculates] credit scores."

Their new policy allows spouses or parents to combine credit files such that the weaker credit file can benefit from the stronger. It’s a great example of a company thinking ahead to steps that may actually reduce the discrimination that exists structurally in our world. In updating their policies, Apple got ahead of the regulation that may come as a result of this inquiry.

This is a strategic advantage for Apple, because NY DFS made exhaustive mention of the insufficiency of current laws governing this space, meaning updates to regulation may be nearer than many think. To quote Superintendent of Financial Services Linda A. Lacewell: "The use of credit scoring in its current form and laws and regulations barring discrimination in lending are in need of strengthening and modernization." In my own experience working with regulators, this is something today’s authorities are very keen to explore.

I have no doubt that American regulators are working to improve the laws that govern AI, taking advantage of this robust vocabulary for equality in automation and math. The Federal Reserve, OCC, CFPB, FTC and Congress are all eager to address algorithmic discrimination, even if their pace is slow.

In the meantime, we have every reason to believe that algorithmic discrimination is rampant, largely because the industry has also been slow to adopt the language of academia that the last few years have brought. Little excuse remains for enterprises failing to take advantage of this new field of fairness, and to root out the predictive discrimination that is in some ways guaranteed. And the EU agrees, with draft laws that apply specifically to AI that are set to be adopted some time in the next two years.

The field of machine learning fairness has matured quickly, with new techniques discovered every year and myriad tools to help. The field is only now reaching a point where this can be prescribed with some degree of automation. Standards bodies have stepped in to provide guidance to lower the frequency and severity of these issues, even if American law is slow to adopt.

Because whether or not discrimination by algorithm is intentional, it is illegal. So, anyone using advanced analytics for applications relating to healthcare, housing, hiring, financial services, education or government is likely breaking these laws without knowing it.

Until clearer regulatory guidance becomes available for the myriad applications of AI in sensitive situations, the industry is on its own to figure out which definitions of fairness are best.
