INDEX
    Explanations

    Tension and negative emotions

    Negative Logits
    Token               Logit
    " funny"            -0.06
    (token not shown)   -0.06
    (token not shown)   -0.06
    "서비스"            -0.06
    "_unix"             -0.06
    " szy"              -0.06
    "animate"           -0.06
    "しょ"              -0.06
    "まった"            -0.06
    " suc"              -0.06
    Positive Logits
    Token               Logit
    " бу"               0.07
    " obsess"           0.07
    " achieved"         0.06
    "Translation"       0.06
    (token not shown)   0.06
    " JL"               0.06
    "会议"              0.06
    " SERIES"           0.06
    "ROUT"              0.06
    " bern"             0.06
    Act Density 0.005%

    No Known Activations