INDEX
    Explanations
    Negative Logits
    " баб"  -0.08
    " melding"  -0.08
    "老太"  -0.08
    "жди"  -0.07
    "telling"  -0.07
    " прор"  -0.07
    " батар"  -0.07
    " Persönlichkeit"  -0.07
    " trustworthy"  -0.07
    " этап"  -0.07
    Positive Logits
    " approx"  0.08
    "	K"  0.08
    " humorous"  0.08
    " approximately"  0.08
    " يص"  0.08
    "Cham"  0.08
    " Guitar"  0.08
    "ully"  0.07
    "_month"  0.07
    "Anthony"  0.07
    Activation Density: 0.122%

    No Known Activations