INDEX
    Explanations
    Negative Logits
    Token        Logit
    ordinated    -0.07
     например    -0.07
     ();↵        -0.07
    utation      -0.06
     chimpan     -0.06
     موقعیت      -0.06
    εια          -0.06
     whereby     -0.06
     niece       -0.06
     привед      -0.06
    Positive Logits
    Token        Logit
     burns       0.13
     burned      0.12
     burning     0.11
     burnt       0.08
     scor        0.08
     burn        0.08
     Burning     0.07
     Burnett     0.07
    hips         0.07
     Theater     0.07
    Activation Density: 0.009%

    No Known Activations