    NEGATIVE LOGITS
    " pata"              -0.08
    (token not shown)    -0.08
    " τη"                -0.07
    "add"                -0.07
    "pling"              -0.07
    "\tadd"              -0.07
    " кең"               -0.07
    " compara"           -0.07
    "inite"              -0.07
    (token not shown)    -0.07
    POSITIVE LOGITS
    "মাত্র"               0.09
    " pure"              0.09
    " lof"               0.08
    " чист"              0.08
    "uding"              0.08
    "-таки"              0.08
    " purely"            0.08
    " factual"           0.08
    " kidding"           0.08
    " исключительно"     0.07
    Activation Density: 0.053%

    No Known Activations