INDEX
    Explanations

    expressions of prevention or resistance against negative outcomes

    Negative Logits
    Token          Logit
    "enza"         -0.15
    " toss"        -0.15
    "ibs"          -0.15
    "uer"          -0.15
    " ÎłÏģÏĮ"       -0.14
    "otch"         -0.14
    "celik"        -0.14
    "openh"        -0.14
    " pun"         -0.13
    " جÙĪØ§ÙĨ"      -0.13
    Positive Logits
    Token          Logit
    "íĻĢ"          0.14
    " anymore"     0.14
    ".Solid"       0.14
    " ä½ı"         0.14
    " ÙĥÙĩ"        0.14
    "оÑĢÑĥ"        0.14
    " à¤ķब"        0.14
    "mid"          0.13
    "793"          0.13
    "mic"          0.13
    Act Density 0.177%

    No Known Activations