    NEGATIVE LOGITS
    Token            Logit
    " Allen"         -0.07
    " Gun"           -0.07
    " Panther"       -0.07
    " Kane"          -0.07
    "_GATE"          -0.07
    "52"             -0.07
    "152"            -0.06
    " Gary"          -0.06
    " Cannon"        -0.06
                     -0.06

    POSITIVE LOGITS
    Token            Logit
    " respect"        0.14
    " respected"      0.12
    " respects"       0.11
    " Respect"        0.11
    " respecting"     0.09
    "šk"              0.09
    " respectable"    0.09
    "respect"         0.09
    " respectful"     0.09
    " disrespect"     0.08
    Activation Density: 0.024%

    No Known Activations