The MoE (Mixture of Experts) approach selectively chooses which ‘experts’ – sub-networks specialised in particular patterns – are best suited to answer a question or solve a problem. The fewer experts activated, the less computation is needed, and the quicker, cheaper and more efficient the LLM can be.
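To make that concrete, here is a minimal, illustrative sketch of top-k expert routing – not any particular model’s implementation, and the sizes and weights are made up – showing how a small ‘router’ scores the experts and only the best few actually run:

```python
import numpy as np

rng = np.random.default_rng(0)

n_experts, d_model, top_k = 8, 16, 2
# Each "expert" is just a small weight matrix here; real experts are full sub-networks.
experts = [rng.standard_normal((d_model, d_model)) * 0.1 for _ in range(n_experts)]
router_w = rng.standard_normal((d_model, n_experts)) * 0.1

def moe_layer(x: np.ndarray) -> np.ndarray:
    """Route one token vector x through only its top_k best-scoring experts."""
    scores = x @ router_w                     # one routing score per expert
    chosen = np.argsort(scores)[-top_k:]      # indices of the top-k experts
    weights = np.exp(scores[chosen])
    weights /= weights.sum()                  # softmax over the chosen experts only
    # Only the chosen experts do any work; the rest of the parameters stay idle.
    return sum(w * (x @ experts[i]) for w, i in zip(weights, chosen))

token = rng.standard_normal(d_model)
print(moe_layer(token).shape)                 # (16,) – same output size, a fraction of the compute
```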
Although they're designed to mimic human-like conversation, and although they seem to perform some kind of search-like knowledge function, they are in fact fundamentally sophisticated word-prediction engines, in which knowledge has been embedded within their statistical manipulation of human language (this recent study showed an LLM encoding deep medical knowledge).
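To illustrate what ‘word prediction’ means in practice, here is a toy example with an invented four-word vocabulary and made-up scores (no real model involved): the network scores every candidate continuation, converts those scores to probabilities, and the knowledge ‘Paris is the capital of France’ shows up simply as ‘Paris’ being the likeliest next word.

```python
import math

# Pretend context: "The capital of France is"
vocab = ["Paris", "London", "banana", "the"]
logits = [6.1, 2.3, -1.0, 0.4]        # invented raw scores a model might assign

exps = [math.exp(l) for l in logits]
total = sum(exps)
probs = [e / total for e in exps]      # softmax: scores -> probabilities

best = max(range(len(vocab)), key=lambda i: probs[i])
print(vocab[best], round(probs[best], 3))   # 'Paris' wins – the "knowledge" lives in the statistics
```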
In psychology, ‘anthropomorphic bias’ describes the human tendency to ascribe human-like characteristics where in fact none exist. This is what’s happening when you see a face in the knot of a tree. It is this same bias that makes the experience of LLMs so uncanny, so real, that makes it…feel… so believable.
This bizarre research into ‘anomalous tokens’ is a really clear demonstration of an LLM vulnerability, one that breaks the anthropomorphic spell. It demonstrates that, to an LLM, words are not words at all: they are converted into tokens.
Patterns are found between tokens, not words. The model isn’t reading; it’s traversing a neural network of token relationships. A hiccup of this approach (amid thousands of other examples in the research) appears when ChatGPT-4 simply doesn’t ‘see’ the word ‘SmartyHeaderCode’. This shows that LLMs really don’t ‘think’ – let alone think like a human – at all.
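You can see the word-to-token conversion for yourself with OpenAI’s open-source tiktoken library; the exact IDs depend on which tokenizer vocabulary you load, so treat the output as illustrative:

```python
import tiktoken  # OpenAI's open-source tokenizer library

enc = tiktoken.get_encoding("cl100k_base")   # a GPT-4-era tokenizer vocabulary

for text in ["hello world", " SmartyHeaderCode"]:
    ids = enc.encode(text)
    pieces = [enc.decode([i]) for i in ids]
    print(f"{text!r} -> ids {ids} -> pieces {pieces}")

# An everyday phrase splits into familiar-looking pieces; an oddity like
# ' SmartyHeaderCode' may land on a rare token or strange fragments.
# Either way, the model only ever sees the integer IDs – never the letters.
```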
Take the image below as a good visual example. It’s a slight tangent, because this model is multi-modal (it interprets both vision and text), but it helps underscore this inhuman aspect of these models.
· A picture has been merged with a graphical axis.
· The LLM is asked to interpret it.
· The idiom ‘crossed wires’ seems appropriate, because the LLM begins to talk about a trend ‘over time’, which the picture obviously doesn’t show.
These examples of hyper-intelligent machines making incredibly ‘dumb’ errors can happen in the wild, but such errors can also be intentionally provoked.
Imagine a bad actor exploiting a discovered vulnerability with an equivalent trick: making a sidewalk look like a road to an automated vehicle, a red light look like a green one, or a stop sign look like a speed-limit sign.
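For a feel of how such a trick works, here is a deliberately toy sketch of the classic ‘adversarial example’ idea against a made-up linear classifier (nothing like a real perception stack): a tiny, targeted nudge to every input value flips the machine’s decision, while a human would notice no difference.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy linear classifier: positive score means "stop sign", negative means "speed-limit sign".
w = rng.standard_normal(400)

# Build an input the classifier correctly labels as a stop sign, with a modest margin.
x = rng.standard_normal(400)
x += (1.0 - x @ w) / (w @ w) * w        # shift x so that its score x @ w is exactly 1.0

def predict(v: np.ndarray) -> str:
    return "stop sign" if v @ w > 0 else "speed-limit sign"

# FGSM-style attack: nudge every input value by at most eps, in the direction
# that pushes the score down (for a linear model the gradient is just w).
eps = 0.01
x_adv = x - eps * np.sign(w)

print(predict(x))                        # stop sign
print(predict(x_adv))                    # speed-limit sign
print(float(np.max(np.abs(x_adv - x))))  # 0.01 – no single value changed by more than eps
```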
Their knowledge doesn’t stem from understanding or memory, but from something inhumanly new, something in-between. Appreciating this difference when working with them will help you understand their limitations, and therefore how to install proper guardrails around their use.
Now that we understand a bit about how these models work, in our next post we’ll explore how they can be controlled and guided.