How hidden computations dominate our lives
Stephen R. Addison, Ph.D.,
Professor of Physics
Dean, College of Natural Sciences and Mathematics
University of Central Arkansas
IF YOU PERFORM a Google search for the word algorithm, you’ll be presented with many different results, the first of which is a dictionary definition that reads, “…a process or set of rules to be followed in calculations or other problem-solving operations, especially by a computer.” This is a simple definition that doesn’t reflect the importance of algorithms in our everyday lives. In many ways, algorithms now dominate people’s interactions with the rest of the world—whether or not they themselves use computers. But just as the half-forgotten algorithms of our early education (the one for long division, say) were met with something less than excitement, the governing algorithms of our modern lives have attracted scant interest from most people.
Contrast public perceptions of robots with those of algorithms. Robots have excited the public imagination since at least the release of the movie Forbidden Planet in 1956. Forbidden Planet’s Robby the Robot was many people’s first experience of a seemingly sentient robot, and all future Hollywood robots can be traced back to the pattern established in that film: a benign technology that, in the hands of fallible humans, brings about the destruction of a civilization.
More recently, in movies like I, Robot and the Terminator series, robots themselves have become malevolent entities. Such books as Martin Ford’s Rise of the Robots: Technology and the Threat of a Jobless Future (Basic Books, 2016) have further heightened fears of a robot-dominated future in which people may become superfluous. Such fears are overwrought, but at this point I should remind you of the modern definition of a robot: A robot is a machine or algorithm that performs tasks formerly performed by people. In other words, your feelings about algorithms and robots should be the same, because an algorithm is a robot.
Today, robots that are machines are replacing industrial workers, and robots that are algorithms are replacing office workers and other knowledge workers. There are those who argue that the current rate of change in the workplace is unprecedented, but technological progress has accelerated continuously since the beginning of the Industrial Revolution, and observers in every era have marveled at the pace of change in their own time. Indeed, prior to the Industrial Revolution, change was often so slow as to go unnoticed, and that is the rate of change that our historical perspective accustoms us to expect.
Slow, incremental change is no longer a feature of our modern data-driven world, but the rate of change in the workplace shouldn’t be a worry: throughout history, new jobs have been created faster than old jobs have disappeared. What’s different about today is the need to educate our workforce to be lifelong learners who can readily adapt to the vagaries of an ever-changing economy. In the future, people who spend their working lives with a single employer, or even in a single industry, will become increasingly rare. Most of us will change careers multiple times over our working lives.
So where do we meet algorithms? Anyone who visits an e-commerce site, or indeed any website, will interact with algorithms, and soon will see advertising related to their browsing activities across all their Internet-connected platforms. People are frequently surprised when they see such advertisements, and often they think their phones are snooping on them. In fact they are, but it’s not through listening to their conversations; it’s through tracking the web searches that they’ve forgotten they performed. Not long ago, I spent a week in Midland, Texas, where I did some Internet searches related to the controls of a car I’d recently purchased. After I returned home to Conway, my smart TV served me advertisements for west Texas car dealers for the entire next week.
We also meet algorithms when we apply for loans or credit cards, or even for a job. Many tasks formerly performed by people are now performed by algorithms. Such algorithms frequently combine information supplied by the applicant with anonymized data drawn from different populations. This information is weighted by those who designed or implemented the system and is used to decide, based on experience with similar applicants, whether an application should be granted and under what terms.
In many cases, decisions are made without human interaction, based on information that was coded into the decision algorithm when it was created. Such decisions have sometimes been described as objective, but these objective decisions are not unbiased—they’re informed by the biases of those who implemented the algorithms, as well as any bias in the datasets that were used to develop the algorithm.
What exactly do we mean when we speak of bias in connection with data and algorithms? When bias arises from data used to develop a model, it’s the result of using one population to make recommendations about a different population. A well-known example is performing medical testing on a largely male test group and applying the result to both men and women. Treatment regimens that work for men do not necessarily work for women.
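This kind of population mismatch can be sketched in a few lines of code. The numbers below are entirely hypothetical and chosen only to make the point: a fixed-dose rule is "calibrated" on a male-only trial group, then applied to a female population whose body weights the trial never covered.

```python
# Hypothetical, illustrative numbers only: a dosing rule derived from
# a male-only trial, then applied to everyone.
male_weights_kg = [70, 75, 80, 85, 90]    # the trial population
female_weights_kg = [55, 60, 62, 65, 70]  # a population the rule is later applied to

# Rule derived from the trial: the fixed dose was only tested on bodies
# within 10 kg of the trial mean.
trial_mean = sum(male_weights_kg) / len(male_weights_kg)  # 80 kg

def dose_was_tested_for(weight_kg, tolerance=10):
    """True if the fixed dose was actually tested on bodies like this one."""
    return abs(weight_kg - trial_mean) <= tolerance

# Every member of the trial population is covered by the rule...
print([dose_was_tested_for(w) for w in male_weights_kg])
# ...but nearly all of the female population falls outside the tested range.
print([dose_was_tested_for(w) for w in female_weights_kg])
```

The "model" here is trivially simple, but the failure mode is the real one: the rule is perfectly consistent with the data it was built from, and wrong for the population it's applied to.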
More about that in a moment. Meanwhile, let’s explore the idea of bias in an algorithm for selecting high school students for participation in a college-level summer program. Let’s say the decision will be made on a composite score based on five categories: ACT score, GPA, class rank, letters of reference, and a statement of interest. For our illustration, let’s say that we score each category on a scale of 0 to 5 and then multiply each score by a weighting factor so that our maximum composite score is 100. If we multiply each score by four, the categories will be equally weighted, and our selected participants will not mirror the population. It’s well known that some demographic groups—students at large, affluent schools with many educational opportunities, for example—score higher on the ACT than do students at small rural schools with a more restricted curriculum and fewer advanced placement classes.
But we can adjust the demographics of the invitees by changing the weighting factors in our selection algorithm. As an example, we might multiply GPA and reference letters by six, leave the letter of interest at four, and multiply class rank and ACT score by two. The maximum is still 100, but the demographics of the invitees will not be the same. Using a higher weighting factor on GPA and letters of reference will bias the results in favor of students from smaller schools. This is a simple example, but one that could be used to ensure that a program’s invitees are more representative of the population as a whole. An algorithm actually used for this purpose would most likely include additional factors and a more complex weighting system, but this example serves to show how an algorithm can be biased.
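The selection scheme above can be written out directly. The two applicants and their category scores below are hypothetical, but the weights are the ones from the text: all fours for equal weighting, then sixes on GPA and references, twos on ACT and class rank.

```python
# Two hypothetical applicants, each category scored 0-5.
applicants = {
    "A": {"act": 5, "gpa": 4, "rank": 5, "refs": 3, "interest": 3},
    "B": {"act": 3, "gpa": 5, "rank": 3, "refs": 5, "interest": 4},
}

def composite(scores, weights):
    """Weighted composite score; weights sum to 20, so the maximum is 5 * 20 = 100."""
    return sum(scores[c] * weights[c] for c in scores)

# Equal weighting: every category multiplied by four.
equal = {"act": 4, "gpa": 4, "rank": 4, "refs": 4, "interest": 4}
# Adjusted weighting from the text: GPA and references up, ACT and rank down.
adjusted = {"act": 2, "gpa": 6, "rank": 2, "refs": 6, "interest": 4}

for name, scores in applicants.items():
    print(name, composite(scores, equal), composite(scores, adjusted))
```

Under equal weighting the two applicants tie at 80, but under the adjusted weighting applicant B (strong GPA and references, weaker ACT) pulls ahead, 88 to 74. Nothing about the arithmetic changed; only the weights did, and with them the outcome.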
Let’s examine another situation that illustrates the action of algorithmic bias. I frequently shop for used books using Internet sites. Such sites aggregate the holdings of bookstores across the world and provide a convenient way of matching books for sale with customers who wish to find and purchase those particular books. I would like the algorithm to be biased to balance cost, the condition of the book, the reliability of the seller, and the proximity of the seller, so that I quickly receive a book in good condition, at a reasonable cost.
But the aggregator has a different agenda: it wants to retain both me as a customer and its network of sellers, all while maximizing its long-term profits. Those interests aren’t necessarily complementary, and as a result I often search beyond the recommended copy. Many people simply follow the recommendations. In an ideal world, we would be able to control the weighting of such algorithms so that the recommended copy would be the one that everyone would select, but we can’t. Recommendation algorithms will always work in the best interests of the aggregator.
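The divergence between my interests and the aggregator's can be shown with the same weighted-scoring machinery. The two listings and all the weights below are hypothetical; the point is that the same inventory, ranked under different weights, produces different recommendations.

```python
# Two hypothetical listings for the same title, each attribute scored 0-1
# (higher is better: lower price, better condition, closer seller, fatter margin).
listings = [
    {"seller": "shop1", "price": 0.9, "condition": 0.6, "reliability": 0.7,
     "proximity": 0.8, "margin": 0.3},
    {"seller": "shop2", "price": 0.5, "condition": 0.9, "reliability": 0.9,
     "proximity": 0.4, "margin": 0.9},
]

def recommend(listings, weights):
    """Return the listing with the highest weighted score."""
    return max(listings, key=lambda l: sum(l[k] * w for k, w in weights.items()))

# What I, the buyer, would weight: cost, condition, reliability, proximity.
buyer = {"price": 0.3, "condition": 0.3, "reliability": 0.2, "proximity": 0.2}
# What the aggregator might weight: its own margin dominates.
aggregator = {"price": 0.1, "condition": 0.2, "reliability": 0.2, "margin": 0.5}

print(recommend(listings, buyer)["seller"])       # the cheap, nearby copy
print(recommend(listings, aggregator)["seller"])  # the high-margin copy
```

With the buyer's weights, the cheap nearby copy from shop1 wins; with the aggregator's weights, the high-margin copy from shop2 is the "recommended" one. The ranking function is identical in both cases; only the weights, which the buyer never sees, differ.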
THE BIAS INHERENT in such sales algorithms causes no long-term harm, but some algorithmic biases do cause actual harm. I’m talking about gender bias. Women are underrepresented in computer science occupations, and they’re also underrepresented in many of the databases that are used to design and implement systems for global populations.
Computing emerged during World War II, and women made up the majority of its workforce. This persisted into the 1960s. By 1973, however, only 13.6 percent of bachelor’s degrees in computer science were awarded to women. This number grew to 37 percent by 1984, but fell to 18 percent as the number of homes with personal computers surged amid marketing campaigns aimed at men. The representation of women in the computer science workforce remains at 18 percent today.
As well as being underrepresented in the workforce, women are grossly underrepresented in data sets. This underrepresentation can have catastrophic consequences. Until recently, clinical trials have largely consisted of male participants, with women of child-bearing age excluded by design. This means that women suffer when drugs come on the market, because the drugs are then used on populations that weren’t included in the trials, and drugs don’t necessarily have the same effect on men and women. Nor do men and women necessarily exhibit the same symptoms when suffering from the same illness, leading to incorrect diagnoses and treatment.
Additionally, women live in a world in which design algorithms are based on standard people who, until recently, have largely been men. In her highly readable and well-researched book Invisible Women: Data Bias in a World Designed for Men (Abrams Press, 2019), author Caroline Criado Perez has succinctly described this situation as “one size fits men.” The effects of living in a world designed for men begin with many women simply not being able to reach the top shelf of cabinets, and extend all the way to a markedly higher risk of fatal injury in vehicular accidents, in cars whose safety features were designed to fit men.
So when we think of the algorithms of data science, we must always be aware of any bias in the construction of an algorithm—and, to be clear, algorithms that we use in decision-making are often biased. Some bias is necessary: a decision algorithm must, by design, favor the qualities that make for an appropriate decision. But we must work to ensure that our algorithms aren’t contaminated with unintentional bias. We must also ensure that the datasets that we use to construct algorithms are representative of the population to which the algorithms will be applied. If the population changes, we need to consider whether or not the algorithms used to make decisions about that population need to be modified or replaced.
There are many other things that we should think about regarding algorithms in this modern life of ours, and in the future I may well explore more of them. For the time being, however, I hope that the thoughts and examples I’ve presented here will stimulate your own thoughts and investigations. Recognizing the problems with underrepresentation of various groups in the computer science profession, and its larger and unexpected consequences, I also hope that I’ve challenged this audience to work to make a difference.