
Why do LLMs make stuff up? New research peers under the hood.

    Fine-tuning helps mitigate this problem, guiding the model to act as a helpful assistant and to refuse to complete a prompt when its related training data is sparse. That fine-tuning process creates distinct sets of artificial neurons that researchers can see activating when Claude encounters the name of a "known entity" (e.g., "Michael Jordan") or an "unfamiliar name" (e.g., "Michael Batkin") in a prompt.

    A simplified graph that shows how different functions and circuits interact in prompts about sports stars, real and fake.



    Credit: Anthropic

    Activating the "unfamiliar name" feature amid an LLM's neurons tends to promote an internal "can't answer" circuit in the model, the researchers write, encouraging it to provide a response that starts along the lines of "I apologize, but I cannot..." In fact, that "can't answer" circuit tends to default to the "on" position, meaning the model will refuse to answer a question unless other features in its neural net suggest that it shouldn't.

    The opposite happens when the model encounters a well-known term such as "Michael Jordan" in a prompt. That activates the "known entity" feature, which in turn causes the neurons in the "can't answer" circuit to be "inactive or more weakly active," the researchers write. Once that happens, the model can dive deeper into its graph of Michael Jordan-related features to provide its best guess at an answer to a question like "What sport does Michael Jordan play?"
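    The gating the researchers describe can be sketched as a toy model. To be clear, this is an illustration of the idea, not Anthropic's actual circuitry: the weights, threshold, and function names below are all invented for the example.

```python
# Toy sketch (illustrative only): a default-on "can't answer" circuit that
# an "unfamiliar name" feature promotes and a "known entity" feature inhibits.
# All coefficients are made up for demonstration.

def cant_answer_activation(known_entity: float, unfamiliar_name: float) -> float:
    """Activation of the 'can't answer' circuit given two input features."""
    baseline = 1.0  # the circuit defaults to the "on" position
    act = baseline - 1.5 * known_entity + 0.8 * unfamiliar_name
    return max(0.0, act)  # ReLU-style: activations don't go negative

def respond(name: str, known_entities: set[str]) -> str:
    known = 1.0 if name in known_entities else 0.0
    unfamiliar = 1.0 - known
    if cant_answer_activation(known, unfamiliar) > 0.5:
        return "I apologize, but I cannot answer that."
    # With "can't answer" suppressed, the model proceeds to entity-related features
    return f"Best guess using {name}-related features..."

print(respond("Michael Jordan", {"Michael Jordan"}))  # answers
print(respond("Michael Batkin", {"Michael Jordan"}))  # refuses
```

    The key design point mirrors the article: refusal is the default, and only a sufficiently strong "known entity" signal pushes the circuit below threshold so the model attempts an answer.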

    Recognition versus recall

    Anthropic's research found that artificially increasing the activation of the neurons in the "known answer" feature could force Claude to hallucinate information about completely made-up athletes like "Michael Batkin". That kind of result leads the researchers to suggest that "at least some" of Claude's hallucinations are related to a "misfire" of the circuit inhibiting that "can't answer" pathway –