INDEX
    Negative Logits
    Token         Logit
    " surg"       -0.09
    "勉强"        -0.08
    " ya"         -0.08
    " kurz"       -0.07
    " worried"    -0.07
    ":red"        -0.07
    " סרט"        -0.07
    "(aux"        -0.07
    "+d"          -0.07
    "-ng"         -0.07
    Positive Logits
    Token         Logit
    "กระท"        0.07
    "ANI"         0.07
    "ighet"       0.07
    "kinson"      0.07
    "phants"      0.07
    " Hol"        0.06
    "Offsets"     0.06
    ",row"        0.06
    "Hits"        0.06
    " Antoine"    0.06
    Activation Density: 0.001%

    No Known Activations