    Negative Logits
    Token           Logit
    "_null"         -0.07
    "_book"         -0.07
    " Crist"        -0.07
    " histo"        -0.07
    "-centric"      -0.07
    "_loc"          -0.06
    "_site"         -0.06
    "iltr"          -0.06
    "ratio"         -0.06
    "Susp"          -0.06
    Positive Logits
    Token           Logit
    " obey"         0.08
    " obedience"    0.07
    " obedient"     0.07
    "MAN"           0.07
    " performed"    0.07
    "ิย"            0.07
    "YPE"           0.06
    "441"           0.06
    "Messenger"     0.06
    " heed"         0.06
    Activation Density: 0.007%

    No Known Activations