Follow adrianbowyer on Twitter

Tuesday, 25 April 2017

TTestComplete



A while ago the British Government was silly enough to allow me onto committees to decide how to spend millions of taxpayers' money on scientific and engineering research.  They even had me chair the meetings occasionally.

We'd get a stack of proposals for experiments, together with peer-review reports from other people like me on whether the experiments were worth doing or not.  The committees' modi operandi were to put the proposals that the reviewers said were best at the top of the pile, then work down, discussing them and giving their proposers the money they wanted until the money ran out.

I liked to cause trouble by starting each meeting with my explanation of why this approach is All Wrong.

"The ones we should put at the top of the pile," I said, "are the ones where half the reviewers say 'Brilliant!' and the other half say 'Rubbish!'.  Those are the proposals that nobody knows the answer to, clearly.  So those are the experiments that are most important."

The other academics there would smile at me indulgently because of my political naivety.  The civil servants would smile at me nervously in case any of my fellow academics actually decided to do what I proposed.  And then everyone would carry on exactly as they had always done.

After a while I started saying no when I was asked to attend.
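For anyone who wants to see the two orderings side by side, here is a minimal sketch.  It assumes each reviewer gives a numerical score, and it uses the standard deviation of those scores as a stand-in for "half say Brilliant, half say Rubbish"; the proposal names and scores are invented for illustration and come from no real committee.

```python
from statistics import mean, pstdev

# Hypothetical reviewer scores (0 = "Rubbish!", 10 = "Brilliant!"); invented data.
proposals = {
    "A": [9, 9, 8, 9],    # everyone likes it
    "B": [10, 0, 10, 0],  # reviewers split down the middle
    "C": [2, 3, 1, 2],    # everyone dislikes it
}

def rank_by_mean(scores):
    """The committees' usual ordering: best average review first."""
    return sorted(scores, key=lambda name: mean(scores[name]), reverse=True)

def rank_by_disagreement(scores):
    """The ordering argued for above: most divided reviews first."""
    return sorted(scores, key=lambda name: pstdev(scores[name]), reverse=True)

print(rank_by_mean(proposals))
print(rank_by_disagreement(proposals))
```

With these made-up numbers the split proposal B moves from the middle of the pile to the top.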

---o---

There has been an understandable fuss recently prompted by some good research by my erstwhile colleague Joanna Bryson and others about algorithmic racism - that is to say things like Google's autocomplete function giving the sort of results you can see in the picture above.

Google's (and others') argument in defence of this is a strong one.  The essence of it is that their systems are driven by their users' preferences and actions; they gather the statistics and show people what most other people want to see when those other people do the same as you do.  The results are modified sometimes from "most other people" to "most other people like you" where "like you" is again the result of a statistical process.  If most other people are racist, historically ignorant cretins, then you will see results suitable for racist, historically ignorant cretins.  They (Google and the rest) are not like newspaper editors deciding what to put in front of people; they are just reflecting humanity back at you, you human you.

But you can see from the picture that the results of this are sometimes very bad, by almost any sensible moral definition.

Clearly what is needed is not the intervention of an editor - that would result in Google, Facebook and the rest turning into the New York Times or the Daily Mail, which would be a retrograde step, not an improvement.  What is needed is an unbiased statistical process that weights searches, hyperlinks and the rest from clever people more heavily than those from stupid people.

Note that I'm not saying that clever people aren't racists, and that stupid people are. I suspect that there is not that good a correlation, though this is interesting.  I'm just saying that in general all the web's automated linking and ranking systems ought to work better if they weighted the actions of people by their intelligence.

But how to grade the intellectual ability of web users?  The answer lies in the big data that all the web companies already use.  Facebook, for example, has a record of billions of people's educational achievements.  More interestingly it should be simple to train a neural network to examine tweets, blog posts and so on and to correlate their content with that educational data.  That network would then be able to grade new people and those who hadn't revealed any qualifications just by reading what they say online and apply weights accordingly.
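The weighting step itself is simple, whatever one thinks of the grading.  The sketch below is only an illustration of the idea: the function name rank_completions, the users, and the weights are all hypothetical, and the weights merely stand in for whatever a grading network might output.  It is not a description of how any real search engine works.

```python
from collections import defaultdict

def rank_completions(events, weight_of):
    """Rank completions by the summed weight of the users who chose them.

    events    -- iterable of (user, completion) pairs
    weight_of -- function mapping a user to a non-negative weight
    """
    totals = defaultdict(float)
    for user, completion in events:
        totals[completion] += weight_of(user)
    return sorted(totals, key=lambda c: totals[c], reverse=True)

# Invented data: three users choose "x", two choose "y".
events = [("u1", "x"), ("u2", "x"), ("u3", "x"), ("u4", "y"), ("u5", "y")]
# Hypothetical weights, standing in for the output of the grading network.
weights = {"u1": 0.2, "u2": 0.2, "u3": 0.2, "u4": 0.9, "u5": 0.8}

unweighted = rank_completions(events, lambda user: 1.0)        # one person, one vote
weighted = rank_completions(events, lambda user: weights[user])
print(unweighted, weighted)
```

Counting heads puts "x" first; weighting the same clicks puts "y" first.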

I have no idea if this is a good idea or not.  It is my idea, but I'm not intelligent enough...


Thursday, 12 January 2017

HardClever


By now there must be a lot of people who actually believe that little glowing lights move along their axons and dendrites when they think, flashing at the synapses.

Anyway.

There has been a lot of fuss about AI lately, what with Google Translate switching over to a neural network, rich people funding AI ethics research, and the EU trying to get ahead of the legislative curve.  There has also (this is humans in conversation after all...) been a lot of stuff on the grave dangers to humanity of superintelligent AIs from the likes of Stephen Hawking and Nick Bostrom.

Before we get too carried away, it seems to me that there is one very important question that we should be investigating.  It is: What is the computational complexity of general intelligence?  Before I say how we might find an answer, let me explain why this is important by looking at the extremes that that answer might take.  

At one end is linear complexity.  In this case, if we have a smart computer, we can make it ten times smarter by using a computer that is ten times bigger or faster.

At the other end is exponential complexity.  In this case, if we have a smart computer, we can make it ten times smarter only by having a computer that is twenty-two-thousand times bigger or faster.  (That is e^10, or about 22,026, times bigger or faster; there may be a factor in there too, but that's the essence of it.)
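The two extremes can be put side by side in a few lines.  This is a sketch of the arithmetic only, using the same simplification as the text (no constant factors); the function resources_needed is a name made up for the illustration.

```python
import math

def resources_needed(smartness_factor, model):
    """Factor by which the computer must grow to be `smartness_factor` times smarter."""
    if model == "linear":
        return smartness_factor
    if model == "exponential":
        # The text's simplification: resources grow as e to the power of the factor.
        return math.exp(smartness_factor)
    raise ValueError(f"unknown model: {model}")

for factor in (2, 5, 10):
    linear = resources_needed(factor, "linear")
    exponential = round(resources_needed(factor, "exponential"))
    print(f"{factor}x smarter: linear needs {linear}x, exponential needs {exponential}x")
```

For a factor of ten the exponential case rounds to 22026, which is where the twenty-two-thousand comes from.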

If smart computers do really present a danger, then the linear case is bad news because the machines can easily outstrip us once they start designing and building themselves and it is quicker to make a computer than to make a person.  In the exponential case the danger becomes negligible because the machines would have great difficulty obtaining the resources to make smarter versions of themselves.  The same problem would inhibit us trying to make smarter machines too (or smarter people by genetic engineering, come to that).

Note, in passing, that given genetic engineering the computers have no advantage over us when they, or we, make smarter versions of themselves or ourselves.  The computational complexity of the problem must be the same for both.

The big fuss about AI at the moment is almost all about machine learning using neural networks.  These have been around for decades doing interesting little tricks like recognising printed letters of the alphabet in images.  Indeed, thirty years ago I used to set my students a C programming exercise to make a neural network that did precisely that.

Some of the computational complexity of neural-net machine learning falls neatly into two separate parts.  The first is the complexity of teaching the network, and the second is the complexity of it thinking out an answer to a given problem once it has been taught.  The computer-memory required for the underlying network is the same in both cases, but the times taken for the teaching process and the give-an-answer process are different and separable.

Typically learning takes a lot longer than finding an answer to a problem once the learning is finished.  This is not a surprise - you are a neural network, and it took you a lot longer to learn to read than it now takes you actually to read - say - a blog post.

The reason for the current fuss about machine learning is that the likes of Google have realised that their big-data stores (which are certainly exponentially bigger than the newsprint that I used to give my students to get a computer to read) are an amazingly rich teaching resource for a neural network.

And here lies a possible hint at an answer to my question.  The teaching data has increased exponentially, and as a result the machines have got a little bit smarter.

On the other hand, once you have taught a neural network, it comes up with answers (that are often right...) to problems blindingly fast.  The time taken is roughly proportional to the logarithm of the size of the network.  This is to say that, if a network takes one millisecond to answer a question, a network twenty-two-thousand times bigger will take just ten milliseconds.

But the real experiments to find the computational complexity of general intelligence are staring us in the face.  They lie in biology, not in computing.  Psychologists have spent decades figuring out how smart squirrels, crows, ants, and all the rest are.  And they have also investigated related matters like how fast they learn, and how much they can remember.  Brain sections and staining should allow us to plot a graph of numbers of neurons and their degree of interconnectivity against an ordering of smartness of species.  We'd then be able to get an idea if ten times as smart requires ten times as much brain, or twenty-two-thousand times as much, or somewhere in between.
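Once such a graph existed, deciding between the two extremes would be a curve-fitting exercise.  Here is a minimal sketch of one way to do it, assuming smartness can be put on a numerical scale: fit a straight line to the raw neuron counts and another to their logarithms, then see which predicts the counts better.  The numbers below are synthetic placeholders generated from an exponential law purely to exercise the code; they are not measurements of any animal.

```python
import math

def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    count = len(xs)
    mx, my = sum(xs) / count, sum(ys) / count
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return slope, my - slope * mx

def sum_sq_error(ys, predicted):
    """Sum of squared differences between observed and predicted values."""
    return sum((y - p) ** 2 for y, p in zip(ys, predicted))

def compare_models(smartness, neurons):
    """Return (linear_error, exponential_error), both in neurons squared."""
    a, b = fit_line(smartness, neurons)
    linear_pred = [a * s + b for s in smartness]
    # An exponential law is a straight line once the neuron counts are logged.
    c, d = fit_line(smartness, [math.log(v) for v in neurons])
    exp_pred = [math.exp(c * s + d) for s in smartness]
    return sum_sq_error(neurons, linear_pred), sum_sq_error(neurons, exp_pred)

# Synthetic placeholder data, NOT measurements.
smartness = [1, 2, 3, 4, 5, 6]
neurons = [1000 * math.exp(s) for s in smartness]
linear_err, exp_err = compare_models(smartness, neurons)
print("exponential fits better:", exp_err < linear_err)
```

Real data would be noisy and the ordering of smartness is itself debatable, so a proper analysis would need more care than this; the sketch only shows the shape of the comparison.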

Finally, Isaac Asimov had a nice proof that telepathy doesn't exist.  If it did, he said, evolution would have exploited and refined it so fast and so far that it would be obvious everywhere.

We, as the smartest organisms on the planet, like to think we have taken it over.  We have certainly had an effect, and now find ourselves living in the Anthropocene.  But that effect on the planet is negligible compared to - say - the effect of phytoplankton, which are not smart at all.  And our unique intelligence took three billion years to achieve.  This is a strong indication that it is quite hard to engineer, even for evolution.

My personal guess is that general intelligence, by which I mean what a crow does when it bends a wire to hook a nut from a bottle, or what a human does when they explain quantum chromodynamics, will turn out to be exponentially hard.  We may well get there by throwing exponential resources at the problem.  But to get further either the intelligent computer, or we, will require exponentially more resources.