INDEX
    Explanations

    proper nouns such as names of people, places, and organizations

    references to notable people, places, or organizations

    Negative Logits
    ¿½
    -0.99
     Ezek
    -0.73
     Mehran
    -0.61
    EMP
    -0.60
     expansive
    -0.60
     Mell
    -0.60
    COMPLE
    -0.59
     nascent
    -0.58
     Naples
    -0.58
    asury
    -0.58
    Positive Logits
     sucks      1.30
     ain        1.22
    !!!         1.07
     doesnt     1.06
    ?!          1.06
     hates      1.05
    !!!!        1.04
    !!          0.97
    !?          0.96
     huh        0.96
    Act Density 0.646%

    No Known Activations