Researchers Build Framework To Avoid Machine Learning’s “Undesirable Outcomes”
By Benjamin Ross, Senior Editor, AI Trends
Researchers at Stanford and the University of Massachusetts Amherst have introduced a framework for designing machine learning (ML) algorithms that make it easier for potential users to specify safety and fairness constraints. Details of the framework were recently published in Science (DOI: 10.1126/science.aag3311).
According to the paper’s authors, current machine learning algorithms “often exhibit undesirable behavior, from various types of bias to causing financial loss or delaying medical diagnoses.” What’s worse, the burden of avoiding these pitfalls often falls on the user of the algorithm and not the algorithm’s designer.
The framework “allows the user to constrain the behavior of the algorithm more easily, without requiring extensive domain knowledge or additional data analysis,” the authors write. This shifts the burden of ensuring that the algorithm is well-behaved from the user of the algorithm to its designer.
As machine learning algorithms have an increasing impact on society, the paper’s authors argue it is important to establish safeguards that will prevent these “undesirable outcomes.” If these outcomes can be defined mathematically, both users and the algorithm can learn how to navigate away from them.
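One way to picture what “defined mathematically” means here is a constrained optimization problem with probabilistic guarantees. The notation below is a sketch based on our reading of the paper's formulation, not a quotation from it: a is an algorithm, D is the training data, f is the objective that measures how useful an algorithm is, each g_i measures one undesirable behavior (g_i ≤ 0 means the behavior is absent), and δ_i is the tolerated probability of failure chosen by the user.

```latex
\arg\max_{a \in \mathcal{A}} \; f(a)
\quad \text{subject to} \quad
\Pr\bigl( g_i(a(D)) \le 0 \bigr) \ge 1 - \delta_i
\quad \text{for all } i \in \{1, \dots, n\}
```

Read this way, the user supplies the definitions g_i and the tolerances δ_i, and the algorithm is responsible for returning only solutions that satisfy them with high probability.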
In an official statement, Philip Thomas, an assistant professor of computer science at the University of Massachusetts Amherst and first author of the paper, explains that the framework makes it easier to ensure fairness and avoid harm for a wide range of industries. It does so by generating “Seldonian algorithms,” an allusion to Hari Seldon, a fictional character created by science fiction writer Isaac Asimov. In Asimov’s Foundation series, Seldon develops a mathematical method that allows him to predict the future in probabilistic terms.
Thomas writes that the framework is a tool that guides researchers to create algorithms that are easily applied to real-world problems.
“If I use a Seldonian algorithm for diabetes treatment, I can specify that undesirable behavior means dangerously low blood sugar, or hypoglycemia,” Thomas states. “I can say to the machine, ‘while you’re trying to improve the controller in the insulin pump, don’t make changes that would increase the frequency of hypoglycemia.’ Most algorithms don’t give you a way to put this type of constraint on behavior; it wasn’t included in early designs.”
The framework works in three steps. First, it defines the goal for the algorithm design process. Second, it defines the interface that the user will use. Third, the framework creates the algorithm.
In order to show viability, the researchers designed regression, classification, and reinforcement learning algorithms using the framework.
As a test of their framework, the paper’s authors applied the generated algorithms to a data set of 43,000 students in Brazil. The task was to predict students’ grade point averages (GPAs) during their first three semesters at university on the basis of their scores on nine entrance exams. Discrimination was measured with a sample statistic designed to capture sexism.
Results showed that commonly used regression algorithms designed using the standard ML approach can discriminate against female students when applied without considerations for fairness. “In contrast, the user can easily limit the observed sexist behavior… using our Seldonian regression algorithm,” the authors write.
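To make the idea of limiting sexist behavior concrete, here is a minimal Python sketch of a safety test in the spirit of a Seldonian regression algorithm. It is an illustration under stated assumptions, not the authors’ implementation. The fairness statistic (the gap in mean prediction error between two groups), the tolerance epsilon, the normal-approximation confidence bounds, and all function names are hypothetical choices made for this example. A candidate model is returned only if held-out safety data show, with confidence of roughly 1 - delta, that the gap is at most epsilon; otherwise the function returns None, meaning no solution was found.

```python
import math
from statistics import NormalDist, fmean, stdev


def mean_bounds(values, delta):
    """Lower and upper bounds on the mean, each holding with probability
    of roughly 1 - delta (normal approximation; an assumption of this sketch)."""
    z = NormalDist().inv_cdf(1.0 - delta)
    half_width = z * stdev(values) / math.sqrt(len(values))
    center = fmean(values)
    return center - half_width, center + half_width


def passes_safety_test(err_a, err_b, epsilon, delta):
    """True if a high-confidence upper bound on |mean(err_a) - mean(err_b)|
    is at most epsilon. Four one-sided bounds share delta via a union bound."""
    lo_a, hi_a = mean_bounds(err_a, delta / 4)
    lo_b, hi_b = mean_bounds(err_b, delta / 4)
    worst_gap = max(hi_a - lo_b, hi_b - lo_a)
    return worst_gap - epsilon <= 0.0


def seldonian_style_select(candidate, safety_data, epsilon=0.05, delta=0.05):
    """Return the candidate model if it passes the safety test, else None.

    safety_data holds (x, y, group) triples with group "a" or "b";
    candidate is any callable mapping x to a predicted y.
    """
    err_a, err_b = [], []
    for x, y, group in safety_data:
        (err_a if group == "a" else err_b).append(candidate(x) - y)
    if passes_safety_test(err_a, err_b, epsilon, delta):
        return candidate
    return None  # no solution found: the guarantee could not be established
```

In use, a candidate model would be fit on one portion of the data and passed to seldonian_style_select together with a separate, held-out portion. A model that systematically overpredicts for one group and underpredicts for the other should be rejected, while one whose errors are balanced across groups should be returned unchanged. A fuller algorithm would also search for candidates that are likely to pass the test rather than checking a single one.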
Thomas and his co-authors hope their framework will open up new avenues for ML application.
“Algorithms designed using our framework are not just a replacement for ML algorithms in existing applications,” Thomas and his co-authors write. “It is our hope that they will pave the way for new applications for which the use of ML was previously deemed to be too risky.”
Learn more at Science (DOI: 10.1126/science.aag3311).