    Explanations

    The neuron fires on subword pieces of verbs meaning “to believe” (e.g. German glaub- and Romance cred-/creer-) across contexts.

    Negative Logits
    <Result          -0.07
     requirements    -0.07
     [[]             -0.06
     дек             -0.06
     Bd              -0.06
     Phys            -0.06
    (name            -0.06
     Kitty           -0.06
     defiance        -0.06
    train            -0.06
    Positive Logits
    uant             0.07
                     0.07
                     0.07
     Crowley         0.07
    romium           0.07
     believe         0.06
     сто             0.06
    UDA              0.06
     existing        0.06
     Fundamental     0.06
    Act Density 0.023%

    No Known Activations