    Explanations

    mentions of the word "lion" along with a high activation value

    repeated mentions of the word "lion."

    Negative Logits
    Token     Logit
    mble      -0.89
    ACTION    -0.80
    chell     -0.79
    ilk       -0.78
    Ñı        -0.75
    lying     -0.74
    matter    -0.69
    ETH       -0.66
    ÑĮ        -0.66
    skirts    -0.65
    Positive Logits
    Token     Logit
    esses     1.25
    fish      1.08
    ess       1.00
    eye       0.96
     lions    0.96
    ous       0.88
    osaurs    0.85
    odon      0.84
    doms      0.84
    iasis     0.83
    Activation Density: 0.018%

    No Known Activations