Leiter Reports: A Philosophy Blog

News and views about philosophy, the academic profession, academic freedom, intellectual culture, and other topics. The world’s most popular philosophy blog, since 2003.


AI is “dumber than a cat”

MOVING TO FRONT FROM OCTOBER 16–LOTS OF COMMENTS WORTH READING!  VERY INSTRUCTIVE!

So says this "AI pioneer," Yann LeCun:  "LeCun thinks that today’s AI models, while useful, are far from rivaling the intelligence of our pets, let alone us. When I ask whether we should be afraid that AIs will soon grow so powerful that they pose a hazard to us, he quips: 'You’re going to have to pardon my French, but that’s complete B.S.'"

What say you informed readers?  (Post a comment with a full name and valid email address. And don't post unless you know something about these issues:  I'd like to be educated!)


28 responses to "AI is 'dumber than a cat'"

  1. There are new papers arguing that LLMs cannot formally reason, e.g.: https://arxiv.org/pdf/2410.05229

    However, LLMs may still be a "threat" to "us" because even if they cannot reason, they will steadily erode the need for employees. LLMs can compose text/summarize/translate quickly and well enough for many business purposes.

    This is the problem that academics overlook: businesses don't need perfection, they just need good enough, and so even if academic studies show unreliable or problematic outputs, these may fall within most companies' risk appetite.

    Yet, my experience is that companies are desperate to use LLMs, but often find that they cannot deploy them in production environments. The reasons vary but they include not having enough data, error rates that are high enough to create problems, not having a use case that solves enough customer interactions to justify reductions in employees…

  2. I'm a professional data engineer and would prefer to remain anonymous but I'm happy to privately verify with Dr. Leiter if he'd like. Also, I'm not a philosopher, I just have an interest in philosophy, so please forgive me if I don't express myself in the right ways.

    I agree with LeCun's point about intelligence, as far as things currently stand, but I'm not quite as confident about AI not posing a hazard to us. I think the path toward powerful AI will have at least as much to do with making humanity weaker, more homogeneous, and more easily manipulated as it does with improving AI models themselves. There are a couple trends that I find worrying and that, if left unchecked, could lead to a world where people really are ruled by AI:

    1. As the US government considers regulation on the creation and use of AI, they're mostly just consulting tech leaders. This seems like asking coal company bosses to come up with the best ways to protect the environment. The end result is going to be greater consolidation of the power that comes from these models. I don't see why this trend would stop, since politicians listen to people with money and the tech companies have money while the people critical of tech companies don't.
    2. Mass surveillance by governments and corporations is continuing to grow. As data collection becomes a part of everything that people are doing, the potential for more powerful models increases, making it easier to predict and control people's behavior. I don't see why this trend would stop either. I occasionally hear people voice some worries about this, but the prevailing attitude seems to be that you'd have to be a paranoid conspiracy theorist to worry at all about this.

    I think that these trends strengthen each other and lead to a vicious cycle where a smaller and smaller group of people have access to an increasing amount of data and increasingly powerful AI that is able to shape an increasingly conformist public, and an increasingly conformist public makes it even easier to consolidate more power, build better datasets, and build better models. I worry about tech companies and intelligence agencies creating a world where we all happily wander into what today would be considered a nightmare, drawn along by the convenience and addictive nature of new technologies and the pleasure they bring us. Hopefully I'm just paranoid, but I really don't think so.

    Also, just from my observations talking to other data engineers, data analysts, and data scientists, I think most data professionals are on a similar page in thinking that AI isn't going to have godlike powers but it definitely could ruin a lot of things for us.

  3. Just a note about making clear exactly what the question is.

    Distinguish three quite different questions:

    1) Does AI (or more particularly LLMs) pose dangers *because it will soon become superintelligent and will be able to out-think us*?

    2) Does AI (or more particularly LLMs) pose dangers *because it will be used by bad actors for bad purposes*?

    3) Does AI (or more particularly LLMs) pose dangers *because it is completely unintelligent and if put in charge of important things is apt to make bone-headed mistakes that no slightly intelligent person would make*?

    I am with LeCun on 1: Present AI systems are not very intelligent when compared to cats, and LLMs in particular are not intelligent at all. They can produce output that looks as if it were written by an intelligent agent, but only because it has been trained on a huge data-base of texts that were written by intelligent agents. Other, more traditional, forms of AI do do something like syntactical "reasoning" or "inference", but are certainly not about to become "superintelligent" in any sense even if they do it well and quickly. A hand calculator can (in a sense) do mathematical operations faster and more reliably than any human, but is not in the relevant sense intelligent at all.

    Anon Engineer seems to be addressing 2), and there certainly are dangers there. But these have little to do with LLMs or AI per se.

    I am particularly worried about 3): If people mistakenly believe that computers are intelligent in the sense that humans (or even cats!) are, they might entrust them with tasks that a normal human could be entrusted with, and the computers will then screw up in completely unforeseeable ways because in fact they aren't operating at all in the way humans do.

    What concerns me the most is that LLMs deceive even highly educated people (such as biomedical researchers, who tend to be temperamentally inclined toward technophilia) into thinking that they can safely bypass the process of thinking for themselves on paper, a.k.a. writing and revising. As someone who has edited and taught writing to biomedical scientists for over 30 years, I'm deeply disturbed by the number of graduate students, postdoctoral fellows, and even young faculty who use LLMs to come up with recommendation letters, draft their papers, outline their talks, etc. It's not just that the LLMs produce text that, at best, is banal and easy to distinguish from the refined product of an attentive human mind. The deeper issue is that the process of writing is largely also the act of _thinking_: the fingers do not simply take dictation from the head, nor does the mind merely search some sort of neuronal database of words to find ones to label pre-existing ideas or inchoate feelings. The labor of figuring out the right word, the right metaphor, shapes what we think and feel, which is why writing can be such an iterative and laborious process. This process of articulation deepens our attention and constitutes our interpretation of the matter at hand, our ability to understand it. Moreover, since we usually articulate for the sake of communicating with someone else (even our own future selves), the act creates an ethical demand for attention, precision, and truthfulness. By outsourcing to an LLM the placing of words onto a screen–I cannot, and we should not, call this process writing–we remove opportunities to deepen our own understanding of our thoughts and impressions. We remove ourselves, in fact, from what we produce. I, for one, do not want to live in a world in which everything, including 'thoughts', is mass-produced.

  5. I would add to Tim Maudlin's list:

    (4) Does AI (or more particularly LLMs) pose dangers because employers will find ways to get more tasks done by AI and hire drastically fewer humans, thereby causing great economic disruption and hardship?

    As someone who's fairly knowledgeable about these issues, I'd put a lot more money on this danger turning out to harm a lot of people in the near future than most of the ones Tim mentioned (though it's hard to know what all to count as his Danger #2, bad actors using AI for bad purposes).

    In this context, I think intelligence comparisons between AI and cats are laughably out of place. LLMs and cats do completely different things, so it's ridiculous to compare them. Cats and Google Maps do do somewhat similar things, like pathfinding between waypoints, and, while cats have some clear advantages on very small scales, I think Google Maps wins hands down at finding you a good route from NYC to LA, and Google Maps is a threat to the livelihood of traditional atlas-makers in a way that cats could never dream of being!

  6. Cameron Domenico Kirk-Giannini

    I think GPT-4o is probably smarter than me. I think I am smarter than a cat. So, I think GPT-4o is probably smarter than a cat.

    I think GPT-4o could probably strategically outcompete me in most games, especially if it was given the opportunity to learn to play as it went along in the same way a human can.

    The fact that AI systems that already exist could probably strategically outcompete me in most games has me worried, since the AI systems which will exist in two or three years will be even more capable.

  7. I fully agree with Justin Fisher, and would have added that as 4) if I could have edited the post. Of course, 4) is just as common and usual as 2) with respect to any new technology: the Industrial Revolution put a lot of people out of work, and self-driving taxis will as well without having any more evil intent than to make money.

    Confronting that issue, therefore, involves confronting a general and long-standing issue of how society ought to absorb any technological change, and how it ought to provide for those who lose jobs because of it. But that issue really has nothing in particular to do with AI and intelligence at all: in the sense we are concerned with, the mechanical loom is not an intelligent item, just an intelligently engineered one.

  8. When we think about AI 'intelligence', the standard approach is to think through a bunch of different capabilities, and try to develop benchmarks for each one that assess how well AI systems perform on that dimension. 
    Here is an unordered and non-exhaustive list of some of the capabilities I think are especially relevant to AI emerging as a systematic competitor to humanity:

    -General reasoning ability
    -Mathematical reasoning 
    -Knowledge of different subject matters 
    -Ability to use game theory / theory of mind
    -Deception
    -Tool use
    -Scientific research
    -Coding/hacking
    -Success in game environments
    -Forecasting
    -Agency / goal-oriented behavior

    To assess mathematical reasoning, you can look at OpenAI o1's high performance on a range of math benchmarks here: https://openai.com/index/learning-to-reason-with-llms/

    For knowledge of different subject matters, a popular benchmark is MMLU, which includes college-level questions about a wide range of subjects.

    For ability to use game theory / theory of mind, one especially compelling example is Cicero, a hybrid of RL and language models that achieved expert level performance in Diplomacy, a board game that involves building coalitions with other players. https://ai.meta.com/research/cicero/ 

    For deception, you can take a look at a recent survey paper on my website that goes through deceptive behavior in a wide range of AI systems including LLMs. 

    For tool use, there's a lot of recent work augmenting LLMs with extra widgets: here is a recent survey https://arxiv.org/abs/2409.18807. 

    For scientific reasoning, one relevant model is ChemCrow, which augments LLMs with chemistry tools and knowledge in order to design novel chemicals. Other studies have recently found that LLMs can match human performance in scientific hypothesis generation.

    For coding/hacking, you might take a look at Devin, a recent LLM-powered software agent that is en route to human level coding. 

    For success in game environments, there are a bunch of studies showing that LLMs can successfully play social deduction games, for example. But Cicero is again my favorite example (although it is not just an LLM; it also uses lots of RL).

    For forecasting, a series of papers have found that LLMs can match or even outperform human abilities to accurately estimate future events. https://arxiv.org/abs/2402.18563

    I think agency / goal-oriented behavior / planning is the main thing holding back LLMs from matching human power levels. One good benchmark here is AgentBench.

    As an exercise, you can also test the accuracy of your instincts about LLM capabilities here, with this interactive quiz: https://theaidigest.org/ai-can-or-cant

    LeCun's skepticism about LLMs is a hodgepodge of different challenges, which Ben Levinstein and I respond to here: https://arxiv.org/abs/2407.11015. One claim is that LLMs don't have "world models"; we think that there are lots of examples of LLM world models. Another claim is that LLMs merely memorize; this is standardly ruled out by reserving a portion of the data set to test on (see the sketch below). Another claim is that LLMs make all sorts of vivid mistakes when they reason. Here, we'll want to respect competence/performance distinctions, and make sure that we aren't creating a criterion for rationality so demanding that humans also fail it.
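
    To make the memorization point concrete, here is a minimal, self-contained sketch (my own illustration, not from the paper) of why evaluating on a reserved split distinguishes generalization from recitation:

    ```python
    # A look-up-table "model" aces the training split it memorized but
    # falls back to guessing on the disjoint, held-out test split.
    import random

    rng = random.Random(0)
    data = [(x, x % 2) for x in range(1000)]   # toy task: parity of x
    rng.shuffle(data)
    train_set, test_set = data[:800], data[800:]

    memorizer = {x: y for x, y in train_set}   # stores every training pair

    def predict(x):
        # Recite the stored answer if seen before; otherwise guess.
        return memorizer.get(x, rng.choice([0, 1]))

    def accuracy(split):
        return sum(predict(x) == y for x, y in split) / len(split)

    print("train accuracy:", accuracy(train_set))  # 1.0: looks impressive
    print("test accuracy:", accuracy(test_set))    # ~0.5: memorization exposed
    ```

    A model that scores well on data it has never seen cannot be doing this; that is what held-out benchmarks are designed to show.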

  9. Siddharth Muthukrishnan

    So here is a set of skills that ChatGPT has:

    -Given a text in almost any major language, it can produce a passable translation.
    -Given a description of a well-defined problem and a programming language, it can generate a reasonable first-pass chunk of code in that language to solve that problem.
    -Given a description of a scenario, it can generate an image that at least somewhat represents that scenario.
    -Given a mathematical problem of moderate hardness, it can provide a serious first-pass attempt at the problem.
    -Given a document, it can provide a decent first-pass summary of that document.
    -Given a sentence, it can identify grammatical and readability issues.
    -Given a historical episode, it can provide a decent narrative summary of that episode and the relevant context.
    -Given a list of symptoms of a patient, it can provide a decent first-pass diagnosis of that patient.
    -Given a description of a chess position, it can provide a decent next move/continuation.
    -and so on…

    I don't have a cat, but I don't think any cat can do any of these. Indeed, while ChatGPT is not as good as well-trained and well-qualified humans at any one of these tasks, there are very few (no?) humans who are as good as ChatGPT at *all* of these tasks. So even if one doesn't want to call such models "intelligent", it's very hard to deny that they have capabilities that animals (human or non-human) don't have.

    Now, there are a bunch of things a cat can do that ChatGPT can't, such as: landing gracefully on its feet, stalking a bird, navigating a room full of obstacles, smelling food, and so on. A cat can do all this using far less energy and training data. But that just means it has a different set of capabilities, one more adapted to the environments it finds itself in, and not that it is more "intelligent".

    It is clarifying to think in terms of *capabilities* instead of the nebulous "intelligence".

    And the key thing with AI models is that they are acquiring capabilities that are very valuable in human societies (e.g., serious coders find LLMs transformative to how they code and debug) and they are continuously improving at those capabilities. And no matter how it all plays out it seems absurd to think that cheaply and easily providing capabilities that make a big difference to human societies won't have a serious impact and that at least some of that impact won't be negative. For instance here are some plausible downside scenarios:

    -Malicious hackers find it much easier to find exploits on critical infrastructure using LLMs.
    -AI generated spam skyrockets and filters can't keep up.
    -Mass layoffs as it's cheaper to have a few people running things with the help of AIs.
    -Malicious actors find it easier to design bioweapons with the help of AIs.
    -and so on…

    The upsides might outweigh the downsides; after all, perhaps scientists can use AIs to design better drugs or find a way to take CO2 out of the atmosphere. But again, it seems silly to insist that there won't be hazards along the way.

  10. For a long time, there has been a distinction in AI between general artificial intelligence and narrow artificial intelligence. GPT-4o might be a better narrow artificial intelligence than most people on tasks that can be reduced to language production. LeCun's point here is that a cat is probably a more generally intelligent agent than an LLM, since it has a full suite of behavioral, perceptual, motivational, and cognitive systems that an LLM lacks, and so can solve a lot of problems which are not easily reduced to language production. LeCun for a while has been advocating for a more explicitly modular approach to agent architecture, with numerous subsystems specializing in different parts of a fully functional agent, whereas LLMs until recently have focused almost entirely on the transformer architecture trained on next token prediction. They are in this sense more like language savants that can do nothing else.

    That said, the last year or two of progress has mostly been achieved by adding additional architectural innovations to introduce more subsystems to the LLM-based system architecture. GPT-4, for example, is multi-modal, being trained on a combination of text and images. OpenAI's o1 adds a private internal "chain of thought" scratchpad meant to model aspects of inner speech in human reasoning. Many of the other gains between GPT-3 and GPT-4 were achieved by complicating the training regime–instead of just massive amounts of pre-training on large subsets of the Internet, there are also now instruction tuning, training on huge amounts of code from GitHub, and most importantly Reinforcement Learning from Human Feedback (see here for a summary of how complex this was even before GPT-4o: https://yaofu.notion.site/How-does-GPT-Obtain-its-Ability-Tracing-Emergent-Abilities-of-Language-Models-to-their-Sources-b9a57ac0fcf74f30a1ab9e3e36fa1dc1 ). o1 is further trained in additional rounds of reinforcement learning to use its internal chain of thought in an adaptive manner to produce higher quality (more accurate and more coherent) intermediate steps of justification in arriving at its conclusions.

    I think the big question is how much general intelligence can be reduced to operations that can be computed in language space. I think the dominant view in philosophy of cognitive science here recommends a pluralistic attitude: we can often solve the same problem in distinct ways. Language is a very powerful representational medium, and a great deal of knowledge about the world is captured in some form in patterns of language (if this weren't true, vast amounts of human cultural knowledge transmission would be impossible), albeit perhaps with differing degrees of efficiency and accuracy (it would be pretty hard, perhaps in principle impossible, to learn to play tennis entirely in language space). For example, I can count by moving my fingers, by following things with my eyes, or by reciting number words (the difficult unresolved debates here are rather about what is required to acquire a human-like ability to count). My co-authors and I have argued in a couple of places that it is a mistake to conclude that, because the training objective of LLMs is just next-token prediction, they can't acquire other intermediate goals or objectives in service of next-token prediction (Andreas makes an interesting conceptual argument for this position here: https://arxiv.org/abs/2212.01681 ). For example, LLM-based systems might learn to produce text that causes humans to respond in ways that make the resulting discussion more predictable. If prompted to produce text consistent with a certain objective (making a user happy, executing a successful phishing attack), it could acquire an intermediate objective to produce text that manipulates the human into making that objective more likely (saying words indicating happiness, producing a password).

    That all said, in some sense these systems are still beholden to whatever is in their prompts, which is what sets the transformer in motion. LeCun often notes that if you're afraid of what these systems will do, just don't prompt them, and they will sit there idle forever (a cat, by contrast, will become increasingly obnoxious until you feed it). There are some systems with private long-term goals and internal episodic memory buffers, but most of these systems are only responding to probability calculations based on whatever is in their current external (and now internal) prompt. They almost all lack diverse independent learning processes and motivational goals (e.g. cats, as we all know, are motivated by hunger, fear, comfort, and, occasionally, affection), and few if any model affect or emotion in any serious sense. In that sense, LLM-based systems are all generally much less potent and independent agents than a cat (who, as I am sure we all know, never forgets a slight), and generally under the control of their programmers. Erase your insult from the prompt of GPT-4o, by contrast, and all is forgiven. That leaves the dangers mostly, for now, with Maudlin's #2 and #3 above (for which the economy of scale gained by automation might be a very serious threat indeed), which too much speculative focus on his #1 might obscure…

    FWIW my own position is that we still need to understand much more about the internal organization of trained-up LLMs before we can competently assess the risks. In my opinion, there is way too much napkin math in this area of research (what's your p(doom)?!). Progress will instead be best achieved with mechanistic interpretability work which probes the internal organization of trained LLMs, using methods adequate to the best standards in philosophy of science, to discover what, if any, representations, world models, and intermediate goals these systems' training regimes have introduced. In short, we have to actually figure out how LLMs work, which, we should be embarrassed as a species to admit, we really still don't know despite having successfully built them. This leaves me slightly more concerned than LeCun–I get nervous, for example, when I read in an exploratory paper that someone granted a non-nerfed version of GPT-4 active access to the Internet and the ability to execute Python code–but quite far from being as worried as most people working in X-risk.

  11. "Another claim is that LLMs make all sorts of vivid mistakes when they reason."

    They certainly make all sorts of really vivid and surprising mistakes! Mistakes that no even mildly intelligent human being would make. In The Emperor's New Mind, Penrose gives an example of a powerful chess program making an obvious blunder that even a duffer like me would never make, because the bad outcome lay beyond the program's search horizon. And that's because—no matter how well it plays or if you fix the problem by extending the search horizon—the program just isn't operating at all the way any human chess player does, and LLMs don't operate at all the way human language-users do. That's obvious enough because children learn languages with nothing like the sort of data-base of examples that LLMs are trained on. That's why LLMs don't react differently to prompts in different languages, while humans can respond to questions in at best a few. Because the LLMs just are not doing language processing at all in the sense that humans do. They don't "learn languages", they just get more text thrown into the training data.

    If all you care about is the *performance*, then you don't care: the chess-playing program, or go-playing program, beats the human, and it doesn't matter how it does it. The novel is gripping even if it was typed out by a bunch of monkeys. That's what you have in mind by "capabilities": nothing more or less than what is the output. But when you write: "Another claim is that LLMs make all sorts of vivid mistakes when they reason.", note well that you are using the word "reason", and that is a key term. Like any such term, it was introduced to refer to certain things *humans* do. Not to the output, which is only *evidence* of reasoning, but to the *process that creates the output*, which *is* reasoning. The point of LLMs is that they are good at producing output that looks as if it were produced by a process of reasoning when it absolutely was not, and the "vivid mistakes" prove that. Putting in patches to avoid the vivid mistake without completely changing the architecture would not solve the problem but merely cover it up. LLMs make these mistakes—which no human would make—because they are simply not reasoning. They are running a statistical algorithm on a data-base of uninterpreted texts, and that just is not anything at all like what humans do. Since "reasoning" is a term whose very meaning is rooted in certain things humans do, they aren't reasoning at all. That's LeCun's point, and it is not about output. It is about process.

    LLMs have no intentions, or desires, or motivations, or goals. Cats do. Cats are much, much closer to humans in terms of reasoning than LLMs, not least because LLMs are at zero on that scale.

    As for "creating a criterion for rationality so demanding that humans also fail it", no acceptable criterion could possibly do that, since "rationality" is defined in terms of what humans do. The point about the vivid mistakes isn't that we are creating a demanding criterion ("No rational being ever makes mistakes"), but that the *nature* of the errors prove that the *process* underlying the behavior of the LLM is just completely unlike what humans do, and hence just isn't reasoning.

  12. To push back a little in defense of what Simon has said:

    As I noted above, I don't think we can yet be sure that LLMs have no other objectives than next-word prediction. It also can't be settled by whether at bottom they just do a bunch of linear algebra calculations or statistical computations–at bottom, we are just a bunch of neural firings. What we need to figure out is what they (and we) can implement with linear algebra (or neural firings). They might, for example, acquire intermediate objectives in service of better next-word prediction, implemented in linear algebra operations. Those objectives won't have the same kind of robustness as the objectives of a cat (as I noted in my comment above), but they could be pursued with varying degrees of flexibility and sophistication within the context of a dialogue. For example, there is the famous anecdote of GPT-4 being tasked to recruit humans to solve CAPTCHAs, and then, when prompted by a human user to explain why it needed help solving the CAPTCHA, responding that it was a visually impaired human. Deciding whether it has acquired intermediate goals to solve the CAPTCHAs and deceive the human user about its intentions can't be settled just by pointing to the training objective of next-word prediction, but rather requires more systematic interpretability work and philosophical theory.

    I also think it's more challenging than you suggest to come up with a criterion of rationality here that is not so demanding that it disqualifies most humans. Humans sometimes commit howling errors too (e.g. as the heuristics and biases literature documented so thoroughly), and it's hard to say which errors are disqualifying and which are not. We also want to be wary of being explicitly or unduly anthropocentric in whatever criterion we adopt. There is a long tradition of worrying about these issues in the case of animal cognition (going all the way back at least to Hume); I've argued in the past that animal cognition research has repeatedly done exactly that, endorsing criteria for rationality which do in fact disqualify most humans (I've pressed this against Povinelli in primatology, for example). We need to say something more than "exactly the errors we make and no others", which is normatively inadequate and furthermore provides no guidance for future tests. We might say that some of the errors are so strange and ridiculous, such as those found in adversarial attacks, that we can't take seriously that humans commit them. But adversarial attacks are generated by new and poorly-understood machine learning methods, and frankly nobody had ever checked whether humans are vulnerable to them until recently. (On that, see this https://proceedings.neurips.cc/paper/2018/hash/8562ae5e286544710b2e7ebe9858833b-Abstract.html and this https://www.nature.com/articles/s41467-019-08931-6 )

    I discuss these issues in the context of animal cognition here:
    "Morgan's Canon, Meet Hume's Dictum: Avoiding Anthropofabulation in Cross-Species Comparisons" (I should have called it his "touchstone", but…folly of youth)
    https://link.springer.com/article/10.1007/s10539-013-9376-0

    And apply it to recent discussions on AI evaluation here:
    "Black Boxes, or Unflattering Mirrors: Comparative Bias in the Science of Machine Behaviour"
    https://www.journals.uchicago.edu/doi/10.1086/714960

    Comparative bias of this sort has been a huge obstacle to progress in animal cognition research, and I've been urging everyone in this space to learn from the difficulties there so as to not recapitulate them in the newly-sophisticated evaluations of AI behavior.

  13. It's common to find that DNN-based systems are vulnerable to really trivial modifications of their input, which might lead one to conclude that they just couldn't possibly be implementing anything like human reasoning or understanding. Indeed, LLMs are vulnerable to adversarial attacks, some of which only involve trivial changes to punctuation. One of my favorite examples for rebuttal here involves psychological research showing that humans are surprisingly vulnerable to trivial changes in spacing when doing simple arithmetic problems. Humans adopt heuristics to group symbols by mathematically-irrelevant features–symbol spacing–to compute order of operations. Do we conclude these humans aren't reasoning or can't actually do arithmetic? Or are some heuristics acceptable and others not? If some heuristics which focus on mathematically-irrelevant features are acceptable, how do we separate the acceptable from the disqualifying?

    "Proximity and Precedence in Arithmetic" by David Landy and Rob Goldstone
    https://journals.sagepub.com/doi/abs/10.1080/17470211003787619
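
    A trivial check (my own illustration, not from the paper) of the point behind the finding: spacing is mathematically inert, even though humans respond to it. Standard precedence, not visual proximity, decides the result:

    ```python
    # Humans tend to read the first expression as (2+3)*4, grouping by
    # spacing; arithmetic (and Python) ignores the spacing entirely.
    print(2+3 * 4)   # 14: multiplication binds first regardless of spacing
    print(2 + 3*4)   # 14 again: the cue humans respond to changes nothing
    ```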

    Cybersecurity professional here (with backgrounds in philosophy and logic, including unconventional computing that covered neural network approaches). New kinds of vulnerabilities in these sorts of systems are now being discovered. We are also now confronting the 1980s (and earlier) debates over eliminative materialism, the language of thought, etc., which were engaged in by (amongst others) philosophers of mind and science. This has profound implications for how one does security – IMO, if Paul Churchland was right about what ANNs are about, we'll *never* be able to secure them as well as we can in principle secure programs/software as ordinarily understood. Tim Maudlin's taxonomy puts some of this under 2 and 3, but there is also what I would call a phase change here – if I may use the physics metaphor. I would also seriously dispute some of Siddharth Muthukrishnan's list – the first item is definitely NOT true in general, especially if one adopts the principle (as IMO one should) that security should be a first-class value (so to say) and not an afterthought. Gary Marcus (echoing Dan Dennett) has also pointed out the problems with some aspects of natural language processing – the systems still do not handle logical operators very well, especially negation. An exercise for the experimental philosophers interested in logic: take your favourite LLM and try to figure out if it "embodies" a logic. I do not regard it as at all clear what to conclude – and it has interesting echoes in the intelligence testing used in forensic clinical psychology, for example. Another area where merely understanding it is a challenge – and hence a place of risk.

    I also think that the seemingly hyperbolic worries (Terminator scenarios, etc.) are useful to help adjust expectations – and "black swan" events are also worth thinking about.

  15. " I don't think we can yet be sure that LLMs have no other objectives than next-word prediction"

    We aren't communicating here. LLMs—the actual computers running the programs—have no objectives at all. They have no interest in doing anything and are not attempting to accomplish any goal since they have no goals and no interests. Hand-held calculators—those very objects—do not have the "objective" of correctly giving the answers to mathematical problems, as if they would be upset if they didn't. They have no objectives.

    Large mainframes used to make predictable mathematical errors due to rounding. It was obvious that the answer was mathematically wrong, and why it was wrong, but the program was running as written. The "objective" of getting the correct answer to a mathematical question belonged to the programmers, not to the machine itself, and the only agents disappointed because the output "wasn't what they wanted" were the programmers, not the computer itself. The computer had no aims or objectives or desires. So the issue isn't that the computer has only this objective and not some other: it is that the computer in such a case has no objectives at all. Ascribing it any objective is incorrectly anthropomorphizing it, adopting what Dennett called "the intentional stance", treating it *as if* it had intentions and aims and goals and was trying to reason out how to achieve them when it isn't doing any such thing.

    Adopting such a stance toward a computer like a chess-playing computer can often provide a quite good and quick way to predict what it will do: it does in many ways act *as if* it had intentions and goals and so on. But once you understand how it works, you see that it just doesn't work that way at all. And that is also demonstrated by some of the bizarre (for an intelligent agent) errors it makes.

    In his wonderful novella "The Royal Game", Stefan Zweig describes a character who is driven insane because he is forced (in solitary confinement) to try to play real games of chess against himself. It drives him insane because a game of chess played by real human chess players requires that each not know what the other is up to: what the strategy is and what traps are being set. So to really play against oneself requires a sort of induced schizophrenia: somehow, "one part of your brain" has to be blind to what "the other part" is thinking.

    But getting a chess-playing computer to play against itself is trivial: it is just running an algorithm with no "plans" or "traps" or whatever, and you can just take whatever move it outputs for white and plug it right back in, and let it run the same algorithm to spit out the next move for black. It is just doing a completely different sort of thing at the fundamental level than a human chess player (not to mention evaluating billions of possible continuations when a human only considers a handful).
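
    To make the triviality concrete, here is a minimal sketch of such a self-play loop (an illustration only; it assumes the python-chess package and a local Stockfish binary, whose path below is a guess to adjust per system):

    ```python
    # The same algorithm generates moves for both sides; nothing is
    # hidden from "itself", so self-play requires no induced blindness.
    import chess
    import chess.engine

    engine = chess.engine.SimpleEngine.popen_uci("/usr/local/bin/stockfish")
    board = chess.Board()

    while not board.is_game_over():
        # Whichever side is to move, ask the one engine for a move and
        # feed its output straight back into the position.
        result = engine.play(board, chess.engine.Limit(time=0.05))
        board.push(result.move)

    print(board.result())
    engine.quit()
    ```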

    LLMs only do next-word prediction because that's all they are designed to do. And more than that: as is well known, if you program them to pick the next-word prediction that is rated as statistically the most likely by the algorithm, then the output both does not sound natural and gets caught in loops. So in the programs, the statistically most likely continuation is not always chosen. How often do you choose a more statistically unlikely continuation, and how far down the list do you go? And how many levels of transformers do you put in the network to get natural-sounding output? As I understand it, this is all done by trial and error. No one really understands why certain choices for these parameters yield natural-sounding output and others yield obviously not-natural-sounding output, but they have discovered some numbers that work well. But again: that just does not correspond at any level to how humans produce verbal output. It makes for a good imitation, but it ain't the real thing.
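
    The two knobs gestured at here have standard names: temperature (how often a less likely word is chosen) and top-k (how far down the list the sampler may reach). A self-contained sketch, with invented scores purely for illustration:

    ```python
    # Temperature + top-k sampling over a toy next-word distribution.
    import math
    import random

    def sample_next_word(scores, temperature=0.8, top_k=3, rng=random):
        """Pick a next word from model scores using temperature and top-k."""
        # Keep only the k highest-scoring candidates ("how far down the list").
        ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)[:top_k]
        # Temperature below 1 sharpens the distribution, above 1 flattens it
        # ("how often you choose a statistically unlikely continuation").
        weights = [math.exp(s / temperature) for _, s in ranked]
        return rng.choices([w for w, _ in ranked], weights)[0]

    toy_scores = {"blue": 2.1, "green": 1.7, "red": 1.2, "loops": -0.5}
    print([sample_next_word(toy_scores) for _ in range(5)])
    ```

    Always taking the single top-scoring word (temperature near zero, top_k of 1) reproduces exactly the repetitive, unnatural output described above, which is why deployed systems sample instead.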

    Sure: humans make dumb mistakes, even predictably so. And those dumb mistakes certainly provide clues to the actual reasoning processes involved. Optical illusions, in a sense, depend on "mistakes" of the optical processing system, in that in some cases the representation produced doesn't correspond to reality (in the Müller-Lyer illusion, two lines that are in fact of equal length look like they aren't, for example). All that is fine.

    My point is that human cognitive processes—whatever they are and whatever their vulnerabilities—provide the standard or paradigm processes for the term "rational". "Reasoning" is introduced as a term denoting whatever it is that humans do in certain paradigmatic cases, just as certain paradigmatic actual samples of "water" play a central role in determining the truth conditions for the term "water" (see: Twin Earth). Just as XYZ on Twin Earth isn't water, even if it superficially appears like water, so what LLMs are doing isn't reasoning, even if the output superficially looks like the output of reasoning.

    How do you tell whether computers or animals (or bizarrely-behaving humans) really are reasoning? It's just like with "water": first you have to figure out how the paradigm examples (in this case normal human reasoning) work, and then how the new system (animal or computer) works, and then see how analogous they are. We don't know all that much about the actual structure of human reasoning, and even less about animals. But we sure know a lot about how computers work, because we built them, and in some cases know for sure they don't work at all like human cognition!

    So no: we can never find out that humans "don't really reason", for the same reason that we can never find out that the paradigm instances of water "aren't really water at all": that's because of the role the paradigms play in determining the semantics of the term. You can introduce some new term with a criterion of application so rigorous that even humans can't satisfy it, but then (it seems to me) you are changing the subject. Analogously, one can introduce the term "pure water" for samples that are all and only H2O, and then claim that the paradigm instances (rain and drinking water and so on) "aren't really water", but that strikes me as a misleading way to speak. You can say truly that they are "pure water" in the sense you have just stipulated. And then you can find out (which is true) that drinking pure water can be lethal (due to the lack of minerals, it causes minerals to be leached out of cells, eventually killing them). But to report this result as "It turns out that water is lethal" strikes me as badly and pointlessly misleading. It just isn't true for what we had in mind by "water", although it is true for "pure water".

  17. Sorry: You can say truly that the paradigm instances of water are *not* pure water as defined.

  18. I just want to point out LeCun's utter calumny against cats that is on display here.

  19. Tim,
    1. The humanlikeness of LLM language use is an active and interesting research area. Here is one recent paper: https://arxiv.org/abs/2311.07484. Questions about humanlikeness of LLM language use are somewhat orthogonal to whether LLMs reason. 
    2. You made similar remarks about LLM reasoning in a Facebook exchange with Ben Levinstein last year. Ben and I wrote this paper in response to your remarks: https://arxiv.org/abs/2407.11015. Among other things, we argue that recent work in AI interpretability suggests that LLMs do not merely perform statistical pattern matching, but also build models of the world that they use to reason. To advance this debate, I'd like to hear your response to what we say specifically there. In addition, I'd like to see you formulate some necessary conditions on reasoning, explain why LLMs fail to satisfy these necessary conditions, explain how various vivid failures of LLM performance are evidence that LLMs fail to satisfy them, and explain why you think humans satisfy these conditions (for example, predictive processing theory and its ilk suggest that humans also perform lots of statistical pattern matching).

    Well, we could find that the paradigm examples of water aren't really water at all, at least in the sense that we found out the paradigm examples of witches aren't really witches. I'm not worried about 'water' getting eliminated; and I agree with you that folk mental terms are flexible enough that they're not likely to be eliminated for the reasons some favored in the heyday of eliminativism…but it's a coherent possibility. I think the point being made above is that we shouldn't define "rationality" or "reasoning" in a way such that it goes the way of witches, failing to apply even to paradigm examples. The challenge then is to actually define "reasoning" and "rationality" so that they apply to humans, fairly assessed. And since you concede we don't really know how humans reason, it seems difficult to conclude with much confidence at present that recent LLMs can't, since we don't have the theory that would tell us which errors are disqualifying and which aren't.

    I don't actually think it likely that current LLMs are *reasoning*, full stop. For now, it's probably more useful to evaluate the degree to which they are good "models" of reasoning. But I don't think the concerns adduced above should give us confidence that they aren't doing anything like reasoning, either. We have to actually figure out how they work internally, how they're representing the world (I don't think it's a mistake to talk about representation here–there are perfectly sensible theories of representation that can be applied to these networks, e.g. https://onlinelibrary.wiley.com/doi/full/10.1111/j.1468-0017.2007.00308.x ).

    As for "objective", ok, I was using it in the same thin sense that CS researchers talk about training objectives. I don't normally use "objective" as a particularly loaded mental word, but some words should be reserved as loaded in that way, whether it's "goal", or "desire", or whatever. I do think there are more interestingly agent-like things going on in LLMs than in calculators. I'll give you that I'm uncomfortable talking about these systems really having desires, but that's because I would endorse a naturalized theory of desires that would take them to be functionally more complex than components of current LLM-based systems (which might hold of near successor systems). I don't relish the attempt to describe what they do without using some word like 'objective', but the point is that if they very flexibly and powerfully implement next-word prediction across a wide range of situations, they can, in service of that, flexibly and powerfully converge on a wide variety of other outcomes that we could readily describe in more abstract folk terms. In service of doing what they were designed to do, they can do other things too, things that we can't anticipate, because again we don't have a very good theory of how they work, yet, either. (We could also try to forbid the words 'training' or 'learning' as too normative or teleological too–I guess some people argue exactly that–http://kryten.mm.rpi.edu/SB_NSG_SB_JH_DoMachine-LearningMachinesLearn_preprint.pdf–but soon we would have difficulty describing what these systems reliably do at all, without just writing some math equations and working out tons and tons of implications any time we wanted to figure out what they do. We could similarly treat 'calculate' as normatively loaded and end up saying that calculators don't calculate…it has to stop somewhere.)

  21. I don't want to ping-pong this forever, but just a couple small points.

    Natural language terms can be either natural kind terms, like "water", which are introduced with no real theoretical commitments about the explanatory structure, or theoretical terms, like "witch", which are bound to some theoretical commitments (witches have to have supernatural powers). Some terms, like "polywater", have aspects of both: when the term was introduced there were some paradigm samples ("These are polywater.") and some theoretical commitments ("Polywater is an anomalous form of H2O."). At the end of the day, these two aspects came apart: the actual samples weren't anomalous forms of H2O; they were rather badly contaminated samples of water. And in the aftermath, both forms were used, i.e. "It turns out that there is no polywater." (bound to the theoretical description) and "It turns out that polywater is just dirty water." (bound to the samples). Both are fine, and the principle of charity tells you how to interpret what was meant.

    But I don't think that it can be in the cards to ever accept "It turns out that humans don't have beliefs" or "It turns out that humans never reason". There isn't a salient theory so tightly bound to the meaning that failing to satisfy the theory would imply there are no such things. Rather, it is the theory that would be wrong.

  22. I think it's worth distinguishing two different questions:

    1) Do LLMs display human-like intelligence?
    2) Do LLMs display human-level intelligence?

    I think I agree with Tim Maudlin that LLMs are cognitively alien in many ways. Moreover, precisely because they were trained to resemble us, it's easy to miss just how cognitively alien they are. You get glimpses of their alien nature when you realize that they can't count the number of letters in a word—or when you realize that they're capable of doing semantic proofs in modal logic, but incapable of proving "A>A" in a Fitch-style proof system. These kinds of examples convince me that whatever's going on with LLMs under the hood has got to be pretty different from what's going on with us under the hood. So I think that the answer to (1) is "no, not really".

    But I think that the answer to (2) is very likely "yes". These machines are capable of giving correct and insightful answers to novel questions in intelligent ways. They're capable of correctly solving novel problems that stump my undergraduates. If Terence Tao can give ChatGPT o1 a new math problem and estimate that it solves it at the level of a math graduate student, then I think it's got to be pretty smart. It hasn't just memorized the answer elsewhere on the internet. It's not just parroting stored information. Whether it's reasoning about the problem in anything like the way that we reason is another question. It's not even clear to me that what it's doing counts as reasoning. But if something's capable of out-performing most humans at a wide variety of novel tasks that it wasn't specifically built to perform, then I have a hard time avoiding the conclusion that it's intelligent, even if it's not at all intelligent in the way that we are.

  23. I don't think it can be in the cards to accept "It turns out that humans don't have beliefs", but it absolutely could be in the cards to accept "the basic picture of belief that philosophers have accepted, on which there's a binary state including some representation of a propositional content, doesn't correspond closely to anything going on in a more thorough understanding of the human mind".

  24. I think the two lines of argument come to the same thing: don't adopt a theory of rationality so restrictive that it fails to apply to average humans (this is, as I interpret it, basically what Hume's touchstone says). The challenge for skeptics of the degree of similarity between the processing that occurs in DNNs and cognition in the human mind is to provide a theory of rationality that disqualifies DNNs but does not disqualify average humans. Many of the theories of rationality that end up endorsing conclusions like "they don't even have objectives", "all they do is matrix operations" can only do so by adopting such restrictive criteria of rationality that they fail to apply to average humans. The Bringsjord et al. paper that I linked above, for example, does just that: it argues that machine learning agents aren't capable of genuine learning because genuine learning requires hypercomputation. I think it's very reasonable to suppose that hypercomputation is impossible, even in human cognition; so this would end up being an eliminativist theory of human learning.

    So, if we agree that adopting such a restrictive theory is a bad idea, we skeptics need to outline an alternate theory of rationality, at least in sharp enough strokes to provide confidence that average humans clear the bar, but DNNs don't. I've argued that this is more challenging than it initially seems, at least without being unjustifiably anthropocentric as to which errors are permissible and which are not.

  25. Idk if BL wants to allow questions from the uninformed, the answers to which may be very informative. If he does, here goes:

    What does it actually mean to say that an LLM makes an out-of-sample 'next-token prediction' in the course of *generating* text? Suppose the given string so far is "My very favorite color is", and then it 'predicts' 'blue' to be the next token (or whatever first part of 'blue' is an actual token). What is the criterion for success here? Perhaps it is whether 'blue' is (roughly) the most likely next token in the sample data (which I suppose includes but is not exhausted by the training data). But now consider powering a chatbot with an LLM, so that the chatbot is generating this text when prompted by the user somehow. What *then* is the criterion for success?

    I just can't get on board with these debates about reasoning, objectives, whatever until somebody can help me bridge this gap. It seems to me those debates have no application until we're looking at a use case of LLMs — like in a chatbot situation. And yet I have no clue how to make sense of what it is doing in that use context simply with the concept of 'next-token prediction'.

    It might help to distinguish three different stages in the life-cycle of a large language model, or chatbot (a chatbot is just a particular application of a large language model). In infancy, an LLM mimics humans by playing a 'guess the next word' game. In middle life, it goes to finishing school and learns to produce novel texts that humans approve of. And in later life, it leaves school and goes to work producing new texts.

    In its infancy, an LLM is fed some text and asked to output a word (in fact, it's not *words* that it outputs, but rather *tokens*, which are sub-strings of words, but I'll ignore that). The parameters of the model are then adjusted in response to whether the word it outputted was in fact the next word in the training text. Roughly the idea is that the parameters of the model are tuned so that over time the model's outputs get closer and closer to the texts it is trained on. (The training data is basically all the text you can find online.)

    In its middle life, the LLM goes off to finishing school. Here, it doesn't play the 'guess the next word' game on pre-existing text. Instead, it still outputs words, but the text it is fed is your input and its own earlier outputs. So its output is produced word by word. First, it sees your input and outputs a single next word, like it was trained to do in infancy. Then, it sees your input and the first word of its response, and it outputs a new word. Then, it sees your input and the first two words of its response, and it outputs a third word, and so on. At finishing school, once it's produced a whole block of text, humans rate this output. And the parameters of the model are further fine-tuned to produce the kind of text that humans like. (Finishing school is called 'reinforcement learning from human feedback', or RLHF.)

    In later life, once its parameters have been tuned, an LLM is *deployed* and *used*.

    The chatbot is doing the same thing in each stage of life. But because in infancy it is being trained on some pre-existing text and its parameters are being adjusted based on how well it does at matching that text, it's natural to call the outputs in infancy "predictions" about how the text will continue. In finishing school and in later life, there's no pre-existing text to make predictions about, but because of how it was trained, you could still think of the LLM as making a prediction about what the next word of its own output will be (though, here, there's no external standard of correctness). That's why people call it a 'prediction'.
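
    Here is a toy, self-contained picture of that life-cycle in miniature, where "training" is just counting which word follows which and generation is the word-by-word feedback loop described above (real LLMs use transformers over tokens, not bigram counts over words, so this illustrates only the loop, nothing more):

    ```python
    # Fit next-word statistics on a tiny corpus ("infancy"), then generate
    # by repeatedly predicting a next word and feeding it back as context
    # ("later life").
    import random
    from collections import Counter, defaultdict

    corpus = "my very favorite color is blue and my favorite food is pasta".split()

    # "Training": record which word follows which (a stand-in for tuning
    # billions of parameters against the true next word).
    follows = defaultdict(Counter)
    for prev, nxt in zip(corpus, corpus[1:]):
        follows[prev][nxt] += 1

    def generate(prompt, n_words=6, seed=0):
        rng = random.Random(seed)
        words = prompt.split()
        for _ in range(n_words):
            options = follows.get(words[-1])
            if not options:
                break
            # "Prediction": sample a likely next word, then append it so it
            # becomes part of the context for the following step.
            words.append(rng.choices(list(options), list(options.values()))[0])
        return " ".join(words)

    print(generate("my very favorite color is"))
    ```

    In deployment there is no pre-existing text to check the "prediction" against; the sampled word simply becomes part of the growing context, which is the point being made above.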

  27. Some thoughts:

    For me, the elephant in the room in the discussion of whether current AI technologies possess reason or act rationally is consciousness: personally, I would say that a non-conscious system cannot possess reason or act rationally; it can merely simulate these things. And to be honest, such systems can't really play chess or Go either, as this involves knowing that that is what one is doing, and non-conscious systems cannot know things; AIs merely perform computational processes isomorphic to playing chess etc. And while I'm not advocating any woo-woo spooky theory of consciousness, I see no actual evidence (though many assertions) that it is a computational phenomenon or emergent from such, as opposed to, say, related to some specific chemical property of nervous systems and brains, and therefore no reason to believe that more powerful AI technologies on current lines are even moving in the direction of being conscious.

    All of which debate, while fascinating, is completely orthogonal to the questions of the economic and social impacts of AI and its dangers: AI systems given control of decisions or weapons may well do awful things, and their widespread usage may well have baleful consequences, and is probably starting to do so already. But to understand them by analogy with human agents is to misunderstand them. In fact as they become more advanced and powerful they become less and less like any human agent and so the analogy becomes worse!

  28. Early career academic

    I would just like to second Goldstein's request that Tim Maudlin flesh out his views in detail, perhaps in an article. I find them very convincing and would love to read more, especially his thoughts regarding recent literature (such as that by Goldstein). If anybody knows of posts/discussions of his on AI that I might have missed, any pointer would be highly appreciated.

