
Can a Computer Feel the World? by Dr. Timothy Smith

Photo Source: Unsplash


Despite all the hype and spending on AI today, some researchers feel we need a course correction, looking beyond the current obsession with large language models to a newer kind of model called a “world model.” In a few weeks, on November 30th, ChatGPT will turn three years old. Its launch transformed artificial intelligence from a fringe concept into a completely mainstream product and set off an AI arms race. AI refers to the science of teaching computers to think and solve problems. ChatGPT and other large language models (LLMs), such as Gemini and Claude, can read and write text that feels very human. LLMs devour vast amounts of text from books, websites, and articles to learn language patterns. During training, words are hidden from the model, which then tries to guess the missing word from the words around it; for models like ChatGPT, the hidden word is typically the next word in the sentence. Each guess is compared with the actual word, and the model is adjusted so that correct guesses become more likely in the future. This method of model building requires vast amounts of data and computing power, and it appears that larger models produce better results. Because of this, the big tech companies, including Alphabet, Microsoft, Amazon, and Meta, have embarked on a $400 billion spending spree to build the infrastructure needed for ever-larger LLMs. (wsj.com) However, some scientists, like Yann LeCun, believe that LLMs cannot make computers truly smart. LeCun, one of the pioneers of AI, argues that we need something called a “world model” instead.
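The guess-the-missing-word idea behind LLM training can be illustrated with a deliberately tiny sketch in Python. This is not how ChatGPT or any real LLM is built: real models use neural networks trained on enormous amounts of text, while this toy simply counts which word tends to follow another. The corpus and the function names (`train`, `guess_hidden_word`) are made up for illustration.

```python
from collections import Counter, defaultdict

# Tiny made-up corpus (an assumption for this sketch);
# real models read vastly more text.
CORPUS = [
    "the pot holds water",
    "the water boils when heated",
    "the pasta gets soft when cooked",
    "the pot holds pasta",
    "the pot holds water and pasta",
]


def train(sentences):
    """Count which word follows each word in the training text."""
    counts = defaultdict(Counter)
    for sentence in sentences:
        words = sentence.split()
        for prev, nxt in zip(words, words[1:]):
            counts[prev][nxt] += 1
    return counts


def guess_hidden_word(counts, previous_word):
    """Guess the hidden word as the most frequent follower of previous_word."""
    followers = counts.get(previous_word)
    if not followers:
        return None  # the toy model never saw this word during training
    return followers.most_common(1)[0][0]


model = train(CORPUS)
# Hidden word in "the pot holds ___"
print(guess_hidden_word(model, "holds"))
```

Even this toy shows the limitation discussed in this article: it can complete "the pot holds ___" only because it has seen those words together, not because it knows anything about pots or water.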

 

A world model (WM) differs from a language model in that it focuses on more than words. WMs try to understand how the real world works by analyzing video and three-dimensional information. Take the word “gravity,” which derives its meaning from experience. You know gravity from feeling its effects if you fall or accidentally drop a fragile porcelain figurine. The pain of a fall or the shattering of the figurine tells you about the world and gravity in a much more complete way than a word can. A world model would let a computer understand such things. It would give the computer a mental picture of objects, spaces, and how they move or change over time. In other words, a world model would begin to feel and anticipate the physical world more like a person does.

 

World models could help computers and robots plan better. If a robot wants to cook dinner, it needs to know that a pot holds water, that water boils when heated, and that pasta gets soft when cooked. A world model could give the robot that kind of knowledge. Moreover, world models could help reduce mistakes. For example, LLMs sometimes “hallucinate,” which means they make up facts that sound real but have no basis in reality. A world model would ground its predictions in logic and physics, which could reduce the likelihood of hallucination. As humans, we imagine, predict, and reason in part based on learned experience of the world. World models aim to give computers those same skills.
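To make the cooking example concrete, here is a minimal Python sketch of planning with a world model. Everything in it is an assumption for illustration: the `Kitchen` state, the four actions, and the hand-written `predict` rules stand in for knowledge that a real world model would have to learn from video and other sensory data. The point is only to show how a model that predicts "what happens if I do this?" lets a machine try out futures in imagination before acting.

```python
from collections import deque
from typing import NamedTuple


class Kitchen(NamedTuple):
    """Hypothetical, highly simplified state of the robot's kitchen."""
    water_in_pot: bool = False
    water_boiling: bool = False
    pasta_in_pot: bool = False
    pasta_soft: bool = False


ACTIONS = ("fill_pot", "heat_pot", "add_pasta", "wait")


def predict(state, action):
    """Hand-written stand-in for a world model: predict the next state."""
    if action == "fill_pot":
        return state._replace(water_in_pot=True)       # a pot holds water
    if action == "heat_pot" and state.water_in_pot:
        return state._replace(water_boiling=True)      # water boils when heated
    if action == "add_pasta":
        return state._replace(pasta_in_pot=True)
    if action == "wait" and state.water_boiling and state.pasta_in_pot:
        return state._replace(pasta_soft=True)         # pasta softens when cooked
    return state  # the action changes nothing in this situation


def plan(start, goal_reached):
    """Breadth-first search over imagined futures; returns a shortest plan."""
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        state, steps = queue.popleft()
        if goal_reached(state):
            return steps
        for action in ACTIONS:
            nxt = predict(state, action)
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, steps + [action]))
    return None  # no sequence of actions reaches the goal


steps = plan(Kitchen(), lambda s: s.pasta_soft)
print(steps)
```

The planner never touches a real stove. It only asks the model what each action would lead to, which is the kind of look-ahead that proponents hope learned world models will provide.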

 

LLMs have their own strengths. Their training data is relatively easy to collect because they need only text. They are flexible and can be used in many areas, from writing essays to helping businesses. They are also improving quickly, with companies making bigger and better versions all the time. Most importantly, they are available now, so people can use them right away.

 

The primary weakness of LLMs is that they do not really understand what words mean in the real world. LLMs can talk, write, and explain things in ways that sound natural, but they have no experience behind the words. Consider the sentence “The frigid wind bit her cheeks as she stepped outside into the storm.” An LLM could construct such a sentence, but not feel it. LLMs also cannot reliably plan or predict physical events, like how a figurine will break when it falls to the floor or how water boils. They sometimes make up facts, which can be confusing or even dangerous if someone relies on them too much. And just making LLMs bigger does not guarantee that they will become truly intelligent. World models, by contrast, may eventually work more like scientists, building an understanding of how the world behaves. They might one day understand the world well enough to help robots act in it. Both approaches are important, but they serve different purposes: LLMs are practical today, while world models are a vision for tomorrow.

 

Yann LeCun believes that LLMs are a “dead end” for building true intelligence. He says that world models provide a path forward because they enable computers to reason, plan, and act like humans. But it may take years of research before world models become practical tools for everyday use. In the meantime, LLMs will keep improving and helping people with language and some reasoning tasks. The future may combine both approaches: LLMs for words and world models for reasoning. Together, they could make AI much smarter and more useful.

 

In the end, the debate between world models and LLMs is about what kind of intelligence we want computers to have. Do we want them to be clever with words, or do we want them to understand the world around them? LLMs continue to change how we work and learn, but world models could change how machines interact with and feel the physical world. Both paths offer exciting opportunities, and both will likely play a role in the future of AI.






Dr. Smith’s career in scientific and information research spans the areas of bioinformatics, artificial intelligence, toxicology, and chemistry. He has published a number of peer-reviewed scientific papers. He has worked over the past seventeen years developing advanced analytics, machine learning, and knowledge management tools to enable research and support high-level decision making. Tim earned his Ph.D. in Toxicology at Cornell University and a Bachelor of Science in chemistry from the University of Washington.


You can buy his book on Amazon in paperback and in Kindle format here.








 

 


 




 
 
 


