    Explanations

    proper nouns related to a specific model mentioned several times in the document

    Negative Logits
    Token                    Logit
    "etheless"               -0.70
    " guiActiveUnfocused"    -0.69
    "IONS"                   -0.67
    "ION"                    -0.66
    " IMAGES"                -0.66
    " Gateway"               -0.65
    "IBLE"                   -0.63
    "åħī"                    -0.63
    "70710"                  -0.60
    " totality"              -0.58
    Positive Logits
    Token     Logit
    "eling"   1.16
    "pler"    1.11
    "ptic"    1.09
    "SPA"     1.07
    "lder"    1.06
    "bye"     1.02
    "ppel"    1.01
    "pper"    1.01
    "ck"      1.00
    "aton"    0.99
    Act Density 0.016%

    No Known Activations