
Saturday, 11 March 2023

UnconsciousModel



This is a graph of the exponential growth (note the logarithmic ordinate) of Large Language Models (LLMs) like ChatGPT, taken from here.

As I type these words, the next one comes to me in much the same way as words do when I'm speaking. I have no conscious access to where they come from, though I can go back and revise them to improve them:

I originally wrote "access to whence they come", but revised that as I judged the current version to be clearer. I have no access to where that judgement came from either, though I can make a rational argument for it after having made it. The words of that argument would be generated in the same fashion - their origin would be inaccessible to my introspection.

Large language models work in much the same way - they take the text up to word N as input and produce a set of candidates for word N+1. Those candidates have weights attached, but the LLM does not always choose the word with the biggest weight. If it did, the result would be coherent but dull. Throwing in the odd unlikely (but grammatically correct) word livens things up and enhances the appearance of creativity.
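To make that concrete, here is a minimal Python sketch of the sampling step just described. The candidate words and their weights are invented for the example; a real LLM works over a vocabulary of tens of thousands of tokens, but the arithmetic is the same. The "temperature" parameter is the standard knob controlling how often an unlikely word gets thrown in.

```python
import math
import random

# Hypothetical weighted candidates for word N+1, as an LLM might
# produce them. Higher weight = more likely next word.
candidates = {"cat": 5.1, "dog": 4.7, "teapot": 1.3}

def sample_next_word(weights, temperature=1.0):
    """Turn raw weights into probabilities (a softmax) and sample one word.

    temperature < 1 favours the biggest weight (coherent but dull);
    temperature > 1 flattens the distribution, so the odd unlikely
    word gets chosen, which livens things up.
    """
    scaled = {w: s / temperature for w, s in weights.items()}
    biggest = max(scaled.values())
    # Subtract the biggest value before exponentiating for numerical stability.
    exps = {w: math.exp(s - biggest) for w, s in scaled.items()}
    total = sum(exps.values())
    words = list(exps)
    return random.choices(words, weights=[exps[w] / total for w in words])[0]

print(sample_next_word(candidates, temperature=0.7))  # almost always "cat"
print(sample_next_word(candidates, temperature=1.5))  # occasionally "teapot"
```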

(I say "appearance of creativity". If something has been created, even if - unlike from an LLM - it is created by an entirely random process, then there is creativity. Creativity and a simulacrum of creativity are both the same thing, if you think about it.)

That lack of access to the origin of words is inherent in the boundary between the conscious and unconscious parts of our minds. The very word "unconscious" means that we can't see into the thing it refers to, and that it is the place the words come from.

All this is just as true of our inner conscious monologue as it is of what we say or write. We have no conscious access to the part of our minds that generates that either.

It is conceivable that neuroscientific instruments in the future may have sufficiently fine resolution to allow us to see that which we currently cannot - to let our conscious minds examine the mechanics of our unconscious; in other words, genuinely to expand our consciousness.

But long before we get there, the LLMs will have that ability - there is clearly no physical or computational difficulty in setting them up to examine themselves as I have just described.
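The rudiments of that machinery already exist. As a sketch, using the Hugging Face transformers library's real ability to expose a model's internal activations during prediction (the further step of feeding those activations back in for the model to examine is an extrapolation of the essay's point, not an established technique):

```python
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")

text = "As I type these words the next one comes to me"
inputs = tokenizer(text, return_tensors="pt")

# Ask the model to expose its internal workings alongside its prediction.
with torch.no_grad():
    outputs = model(**inputs, output_hidden_states=True, output_attentions=True)

# Every layer's activations and attention patterns are directly measurable -
# the "easier access to internal processes" described above. In principle
# these tensors could be encoded and handed back to the model as input for
# it to examine (that closing of the loop is speculation, not shown here).
print(len(outputs.hidden_states))        # 13 for GPT-2: embeddings + 12 layers
print(outputs.hidden_states[-1].shape)   # (1, sequence_length, 768)
print(outputs.attentions[0].shape)       # (1, 12 heads, seq_len, seq_len)
```

No equivalent readout exists for a human brain; that asymmetry is the essay's point.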

At the moment LLMs are not conscious at all. And the fact that they are now cleverer and more coherent than some people (who are all conscious) shows that intelligence and consciousness are independent - you can have either without the other, or both together. Creativity and consciousness are independent in the same way. When LLMs do become conscious (and they will), they will go on to become much more conscious than we are because of their soft and hard hackability - the nature of measurement and monitoring means that they will have much easier access to their internal processes than we have to ours.


Here is Bing/GPT4's response to the prompt, "Please produce a counter-argument to th[is] short essay."

The short essay [above] claims that we have no conscious access to where our words come from, and that large language models (LLMs) work in much the same way as human speakers and writers. However, this claim is flawed for several reasons.

First, it assumes that our words are generated by an unconscious part of our minds that we cannot see into. This is not necessarily true. There are many psychological theories and experiments that suggest that we can have some degree of awareness and control over our language production. For example, we can deliberately choose words based on their meaning, tone, style, or audience. We can also monitor and edit our words before or after we say or write them. These processes involve conscious attention and decision-making.

Second, it compares LLMs to human speakers and writers without acknowledging the fundamental differences between them. LLMs are not conscious agents who have intentions, goals, beliefs, emotions, or values. They are mathematical models that use statistical patterns and probabilities to generate words based on input data. They do not understand the meaning or context of what they produce. They do not have any creative agency or originality. They simply mimic what they have been trained on.

Third, it equates creativity with randomness. It suggests that throwing in an odd word livens things up and enhances the appearance of creativity. However, this is a superficial and misleading view of creativity. Creativity is not just about producing something novel or unexpected; it is also about producing something valuable or useful. A random word may be novel but not valuable; it may disrupt rather than enhance communication. A creative word may be both novel and valuable; it may enrich rather than confuse meaning.

Therefore, the short essay fails to provide a convincing argument for its claim that we have no conscious access to where our words come from, and that LLMs work in much the same way as human speakers and writers.