Tuesday, 25 April 2017

TTestComplete



A while ago the British Government was silly enough to allow me onto committees to decide how to spend millions of taxpayers' money on scientific and engineering research.  They even had me chair the meetings occasionally.

We'd get a stack of proposals for experiments, together with peer-review reports from other people like me on whether the experiments were worth doing or not.  The committees' modus operandi was to put the proposals that the reviewers said were best at the top of the pile, then work down, discussing them and giving their proposers the money they wanted until the money ran out.

I liked to cause trouble by starting each meeting with my explanation of why this approach is All Wrong.

"The ones we should put at the top of the pile," I said, "are the ones where half the reviewers say 'Brilliant!' and the other half say 'Rubbish!'.  Those are the proposals that nobody knows the answer to, clearly.  So those are the experiments that are most important."

The other academics there would smile at me indulgently because of my political naivety.  The civil servants would smile at me nervously in case any of my fellow academics actually decided to do what I proposed.  And then everyone would carry on exactly as they had always done.

After a while I started saying no when I was asked to attend.

---o---

There has been an understandable fuss recently, prompted by some good research by my erstwhile colleague Joanna Bryson and others, about algorithmic racism - that is to say, things like Google's autocomplete function giving the sort of results you can see in the picture above.

Google's (and others') argument in defence of this is a strong one.  The essence of it is that their systems are driven by their users' preferences and actions; they gather the statistics and show you what most other people want to see when they do the same as you do.  The results are sometimes modified from "most other people" to "most other people like you", where "like you" is again the result of a statistical process.  If most other people are racist, historically ignorant cretins, then you will see results suitable for racist, historically ignorant cretins.  They (Google and the rest) are not like newspaper editors deciding what to put in front of people; they are just reflecting humanity back at you, you human you.

But you can see from the picture that the results of this are sometimes very bad, by almost any sensible moral definition.

Clearly what is needed is not the intervention of an editor - that would result in Google, Facebook and the rest turning into the New York Times or the Daily Mail, which would be a retrograde step, not an improvement.  What is needed is an unbiased statistical process that weights searches, hyperlinks and the rest from clever people more heavily than those from stupid people.

Note that I'm not saying that clever people aren't racists and that stupid people are.  I suspect the correlation is not that strong, though this is interesting.  I'm just saying that, in general, all the web's automated linking and ranking systems ought to work better if they weighted the actions of people by their intelligence.
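
To make the idea concrete, here is a rough sketch in Python of the difference between counting every action equally and scaling each one by a per-user weight.  Everything in it - the users, the weights, the completions - is invented for illustration; it is obviously nobody's real ranking code.

```python
# A toy illustration of weighted ranking: instead of counting every user's
# "vote" for a completion equally, scale each vote by a per-user weight.
from collections import defaultdict

def rank_completions(votes, user_weight):
    """votes: iterable of (user_id, completion) pairs;
    user_weight: dict mapping user_id -> weight (unknown users count as 1.0)."""
    scores = defaultdict(float)
    for user_id, completion in votes:
        scores[completion] += user_weight.get(user_id, 1.0)
    # Highest weighted score first.
    return sorted(scores, key=scores.get, reverse=True)

# Invented data: five accepted completions for the same prefix.
votes = [
    ("alice", "why is the sky blue"),
    ("bob", "why is the sky blue"),
    ("carol", "why did we never land on the moon"),
    ("carol", "why did we never land on the moon"),
    ("dave", "why did we never land on the moon"),
]
weights = {"alice": 1.5, "bob": 1.2, "carol": 0.2, "dave": 0.2}
print(rank_completions(votes, weights))
# Raw counts would put the moon-landing nonsense first (3 votes to 2);
# the weighted count puts the sensible completion first (2.7 to 0.6).
```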

But how to grade the intellectual ability of web users?  The answer lies in the big data that all the web companies already use.  Facebook, for example, has a record of billions of people's educational achievements.  More interestingly, it should be simple to train a neural network to examine tweets, blog posts and so on and to correlate their content with that educational data.  That network would then be able to grade new people, and people who hadn't revealed any qualifications, just by reading what they say online, and weights could be applied accordingly.
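
For what it's worth, the machinery involved is not exotic.  Here is a toy sketch of that loop, with a small scikit-learn pipeline standing in for the neural network and a handful of invented posts standing in for Facebook's data: learn to predict a declared qualification from text, then use the model's output as the weight for somebody who has declared nothing.

```python
# A toy sketch of the grading idea: learn a mapping from text to a declared
# qualification, then use the model's confidence as a per-user weight.
# The data is invented and far too small to mean anything; it only shows the shape.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline

# Hypothetical training set: posts paired with whether the author listed
# a degree on their profile (1) or not (0).
posts = [
    "the experimental design does not control for any of the obvious confounders",
    "peer review is slow and imperfect but it mostly gets the right answer",
    "correlation is not causation so that headline is meaningless",
    "they faked the moon landings wake up people",
    "do your own research the experts are lying to you",
    "the earth is flat and the photos are all cgi",
]
has_degree = [1, 1, 1, 0, 0, 0]

model = make_pipeline(
    TfidfVectorizer(),
    MLPClassifier(hidden_layer_sizes=(16,), max_iter=5000, random_state=0),
)
model.fit(posts, has_degree)

# Grade somebody who has told us nothing about their qualifications,
# purely from what they write, and use that number as their weight.
new_post = ["the sample size in that study is far too small to conclude anything"]
weight = model.predict_proba(new_post)[0][1]   # probability of the "degree" class
print(weight)
```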

I have no idea if this is a good idea or not.  It is my idea, but I'm not intelligent enough...


1 comment:

  1. I have always thought democracy would give better results if votes were weighted by IQ.
