“Data Sciences and Society” Reading Group
For this meeting, we read two articles by Sanmay Das, both of which tackle the question of how to improve the efficiency of the allocation of homelessness interventions. Dr. Das is interested in the application of optimization to social issues, from matching markets to social networks to finance. These two articles on homelessness, a topic we discussed earlier in the semester after reading Automating Inequality, gave us a welcome opportunity to see what is inside the algorithms trusted to govern social life, which are so often black-boxed in social analyses.
At the start, Dr. Das offered a brief overview of the two papers but concentrated on describing the case study on homelessness services in a metropolitan area. He described technical issues such as the way the data were collated, the model of complexity developed, and the features of the algorithms for alternative allocations of scarce resources. His summary also devoted a fair amount of discussion to issues of ethics and fairness and to the central role of what he called human interpretability. Canay Özden-Schilling provided a detailed comment on the papers, followed by a lively discussion in which Das fielded various questions.
What we see when we open the black box of resource allocation algorithms is a matching system. In this paper, Das and his collaborators (from here on only “Das” for simplicity) work with data acquired from the homelessness services of a major metropolitan area. The algorithm currently in effect matches the specific interventions that alleviate homelessness (in this case five interventions ranging from simple prevention methods to the most comprehensive “permanent housing” tool) with a heterogeneous population with diverse needs. But is the system accomplishing what it has set out to do? Is it, in fact, reducing homelessness to the extent possible? Das measures this by asking a counterfactual question: would the outcome have been better if different households were matched with different interventions? What the paper does is a reshuffling of cards: it simulates different matching scenarios and evaluates the aggregate number of homeless households in each hypothetical case. The proxy for continued homelessness is re-entry into the homeless system within two years. The proxy for how a household would respond to a different intervention is how other households with similar characteristics have historically responded to that particular intervention. In a simulation run with these proxies and a better optimization method, re-entry into homelessness does indeed drop to 37% from the actual 43%.
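The reshuffling described above can be illustrated with a toy sketch. This is not Das's actual method or data: the household names, the three interventions, and the re-entry probabilities below are all hypothetical, and the exhaustive search stands in for whatever optimization method the paper uses. The sketch keeps the number of slots per intervention fixed at what was actually handed out and looks for the reassignment with the lowest expected number of re-entries.

```python
from itertools import permutations

# Hypothetical predicted probabilities of re-entry within two years,
# per household and per intervention. Illustrative numbers only.
REENTRY_PROB = {
    "household_a": {"prevention": 0.20, "rapid_rehousing": 0.25, "permanent_housing": 0.15},
    "household_b": {"prevention": 0.60, "rapid_rehousing": 0.40, "permanent_housing": 0.20},
    "household_c": {"prevention": 0.30, "rapid_rehousing": 0.35, "permanent_housing": 0.30},
    "household_d": {"prevention": 0.70, "rapid_rehousing": 0.55, "permanent_housing": 0.25},
}

# Hypothetical allocation actually observed in the records.
ORIGINAL = {
    "household_a": "permanent_housing",
    "household_b": "prevention",
    "household_c": "rapid_rehousing",
    "household_d": "prevention",
}


def expected_reentries(allocation, probs):
    """Sum of predicted re-entry probabilities under a given allocation."""
    return sum(probs[household][intervention]
               for household, intervention in allocation.items())


def best_reshuffle(original, probs):
    """Try every way of redistributing the same intervention slots.

    Exhaustive search is only feasible for a handful of households;
    it is used here purely to make the idea concrete.
    """
    households = sorted(original)
    slots = [original[h] for h in households]  # same slots, to be reshuffled
    best = dict(original)
    best_cost = expected_reentries(best, probs)
    for perm in set(permutations(slots)):
        candidate = dict(zip(households, perm))
        cost = expected_reentries(candidate, probs)
        if cost < best_cost:
            best, best_cost = candidate, cost
    return best, best_cost


if __name__ == "__main__":
    allocation, cost = best_reshuffle(ORIGINAL, REENTRY_PROB)
    print("observed expected re-entries:", expected_reentries(ORIGINAL, REENTRY_PROB))
    print("reshuffled expected re-entries:", cost)
    print(allocation)
```

In this toy example the most intensive intervention moves from a household that does almost as well without it to the household that benefits most, which is the pattern the next paragraph describes.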
Put simply, the current system is offering too much assistance to someone who needs only a little and offering too little to someone who needs more. There is, Das argues, a huge number of households that could be helped by tweaking the algorithm for better matches. He adds, “The right approach is then to specify appropriate optimization goals, arrived at through the social processes of policy-making, which could be based on both efficiency and equity considerations.” The juxtaposition of efficiency and equity strikes me in this formulation. In our discussion on Automating Inequality, we dwelled on how some algorithmic tools conflate past marginalization with future high-risk status—hence perpetuating a feedback loop where marginalized people get more marginalized. That is to say, we talked extensively about bias, but perhaps not enough about efficiency, even though the two seem to be closely linked. Elsewhere in the paper, Das relays a striking finding—that when the inefficiencies are fixed in the allocation system, those who are helped more seem to be “those who stand out as being more in need.” Then I have to wonder: is inefficiency a form of bias? Is bias a form of inefficiency? How and where does bias occur separately from inefficiency?
This also gets back to one of the questions our reading group asked in our earlier discussion: is bias a factor of the design of algorithms or a factor of their implementation? The problem may very well be at the design level, where allocation designers choose to collect information on certain variables in trying to assess, e.g., the creditworthiness of a household, or in the way they code these variables (e.g., based on their assumptions about what kinds of living, housing, and parenting are proper). It seemed to me that in the two papers we read, the implication was that the inefficiency problem occurred at the level of implementation instead, since Das worked with the same design and data as provided to him by the homelessness service. Collecting different data would require a new set of eyes: new questions to ask the population, hence new qualitative research. This brings me to my next question: how would Das’s quantitative work, which improves upon the system’s existing quantitative approach, interface with qualitative research?
For instance, Das describes a very interesting instance of making an adjustment to his optimization algorithm for equity and fairness purposes. Das suspects during the simulations that, as a result of optimizing the allocation, some households might have moved too far down the ladder of help, which would create an undue fairness issue. To correct for that, Das goes back to the algorithm to add a constraint on how far a single household can move up or down as a result of the optimality adjustment. This struck me as a fundamentally qualitative kind of work on Das’s part: an endeavor to ask whether the quantitative work has fairness consequences that the algorithm may have been blind to. The questions, then, are compounded: is qualitative work reserved for human eyes that need to keep watch on harmed groups and constantly add constraints to the algorithm? Can we teach the machine to detect equity issues? If we are able to do so, doesn’t the defining and teaching of equity still constitute qualitative work? Is Das’s example getting to an answer to the question of how we fix bias, namely constant monitoring of the algorithm both qualitatively and quantitatively, human and machine alike?
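To make the idea of such a constraint concrete, here is a small sketch under stated assumptions. The three-rung ladder, the household names, the probabilities, and the `max_shift` parameter are all hypothetical, and the exhaustive search is only a stand-in for the paper's optimization method. The point is simply that a cap on how many rungs any household may move turns a fairness concern into a feasibility check on candidate allocations.

```python
from itertools import permutations

# Hypothetical ladder of interventions, from least to most intensive.
LADDER = ["prevention", "rapid_rehousing", "permanent_housing"]
RUNG = {name: position for position, name in enumerate(LADDER)}

# Hypothetical predicted re-entry probabilities. Illustrative numbers only.
PROBS = {
    "household_a": {"prevention": 0.20, "rapid_rehousing": 0.25, "permanent_housing": 0.15},
    "household_b": {"prevention": 0.60, "rapid_rehousing": 0.40, "permanent_housing": 0.20},
    "household_c": {"prevention": 0.30, "rapid_rehousing": 0.35, "permanent_housing": 0.30},
    "household_d": {"prevention": 0.70, "rapid_rehousing": 0.55, "permanent_housing": 0.25},
}

# Hypothetical allocation actually observed in the records.
OBSERVED = {
    "household_a": "permanent_housing",
    "household_b": "prevention",
    "household_c": "rapid_rehousing",
    "household_d": "prevention",
}


def within_shift_limit(original, candidate, max_shift):
    """True if no household moves more than max_shift rungs up or down."""
    return all(abs(RUNG[candidate[h]] - RUNG[original[h]]) <= max_shift
               for h in original)


def constrained_reshuffle(original, probs, max_shift):
    """Best reshuffle of the same slots that respects the shift limit."""
    households = sorted(original)
    slots = [original[h] for h in households]
    best = dict(original)
    best_cost = sum(probs[h][original[h]] for h in households)
    for perm in set(permutations(slots)):
        candidate = dict(zip(households, perm))
        if not within_shift_limit(original, candidate, max_shift):
            continue  # violates the fairness constraint, so skip it
        cost = sum(probs[h][candidate[h]] for h in households)
        if cost < best_cost:
            best, best_cost = candidate, cost
    return best, best_cost


if __name__ == "__main__":
    for limit in (1, 2):
        allocation, cost = constrained_reshuffle(OBSERVED, PROBS, limit)
        print("max_shift =", limit, "expected re-entries =", cost, allocation)
```

With these made-up numbers, the tighter limit gives up some of the efficiency gain in exchange for keeping every household within one rung of where it started. Deciding what limit is acceptable is exactly the kind of judgement the paragraph above calls qualitative.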
Das’s papers sing the praises of fixing small inefficiencies. As he puts it, “Small efficiencies in keeping people housed yield disproportionately large reductions in homelessness.” I am struck by the humility of this statement, how it presents the optimizer’s work as important and modest at the same time—not always the language we encounter in the worlds of data, machine learning, and algorithms. But this, I can’t help but notice, stands at odds with the assured title of the same paper, “Solving Homelessness.” The way I use “solve” in everyday life might not be the same as, for instance, “solving for x” in mathematics—or perhaps it is. In any case, this makes me wonder: can optimization really solve homelessness (or solve for homelessness)? Is there a place where inefficiency-fixing cannot go?
Veena’s discussion notes:
Das responded to these fascinating issues by first acknowledging that speaking of solutions to homelessness is fraught with problems. He cited one of his students who was asked whether the problem of homelessness could be solved by 2020; she replied that under a particular definition of homelessness one could claim that, but then new problems would come rushing in 2021. The further point Das made was that it is becoming increasingly clear that there are different notions of fairness and that it is mathematically impossible, as the various impossibility theorems show, to reconcile these different notions into one grand theory. For example, statistical fairness might conflict with fairness to the individual. Hence human deliberation and judgement are key to understanding how to make the debate on fairness and justice in the case of homelessness operative in a given context. Finally, Das emphasized that whichever interventions one makes, some people will be adversely affected. So there is a central role for rights to appeal and for efforts to modify the algorithms in view of the actual experiences of those adversely affected. The final decision on particular cases can only be taken by those who are actually working with the homeless. This is why, Das explained, he likes to work with those who have genuine stakes in the problem: transplant surgeons in the case of algorithms for kidney matches, case workers and public health specialists in the case of homelessness, and so on.
In the question & answer session that followed, there was discussion around four main issues. First, how does one take care of the bias in the data, given that the data on the extent and type of homelessness were filtered through the case workers’ decisions? Second, what was the rationale for taking a two-year duration, and would it affect the findings if the duration were reduced to one year or extended to three years, for instance? Third, what was the role of counterfactuals in the model: were these equivalent to the thought experiments in philosophy that are useful for clarifying a thought? Fourth, what kinds of systematic changes was the algorithm suggesting? Was there some way to identify specific types of households whose outcomes were improving?
Das responded that, indeed, it had taken them a whole year to clean the data and that the data available from these records link homeless service records with requests for assistance through a regional homeless hotline. By using the administrative data on a weekly basis for 166 weeks and using counterfactual data to ask if a household would have reentered the homeless system within two years, they found that their model was well-calibrated. They used a two-year period because a one-year period generated data that were very noisy, while a three-year period had too many variables.
The paper, Das said, was in the nature of a proof of concept and a case study, meant to generate further discussion of fairness and ethical issues and of the long-term dynamics of systems that use these kinds of predictive modules. At the same time, he said, since current practices of allocation into different kinds of housing were not evidence-based, there was a need for widespread discussion of these kinds of issues among different constituencies.
On the question of the ability of the algorithm to identify specific types of households whose outcomes were improving, Das responded that their initial attempts to find such households did not yield a “nice” characterization. He took the suggestion that they might want to do a baseline comparison with a random allocation, to see how outcomes under the current mechanism would differ from those under a random one. There was some general discussion of how qualitative methods might be added to these models. Das responded that they did plan to interview caseworkers, but not until they had very well-defined questions and could generate some resources to help with the work of the caseworkers (e.g., providing money for additional personnel for the hotline, which was facing budget cuts). The caseworkers were very overworked and often very stressed by the pressures of work. But he and his colleagues would love to see some ethnographies of how case workers actually made decisions on allocations: what were their thought processes?
Overall, the papers generated a very lively discussion across disciplinary boundaries, showing that, faced with urgent societal issues, different kinds of methodologies and theoretical preoccupations can be effectively calibrated to address issues of ethics.