    Negative Logits
     cooled      -0.08
    _other       -0.08
    _OTHER       -0.08
     tangent     -0.08
    cool         -0.07
     scrolling   -0.07
                 -0.07
    几个         -0.07
    _A           -0.07
    Geb          -0.07
    Positive Logits
     risky       0.08
    asile        0.08
     liječ       0.08
     պահանջ      0.08
     Perspekt    0.08
     игровых     0.08
    ілік         0.08
    דרש          0.08
     optimise    0.07
    odni         0.07
    Activation Density: 0.001%

    No Known Activations