INDEX
    Negative Logits
    Token            Logit
    " chocol"        -0.07
    " vera"          -0.06
    "\tdirection"    -0.06
    " gris"          -0.06
    " мала"          -0.06
    " Department"    -0.06
    "Mart"           -0.06
    "=\"%"           -0.06
    "(ERROR"         -0.06
    " जव"            -0.06
    Positive Logits
    Token            Logit
    " playground"    0.11
    " Playground"    0.10
    " Сред"          0.07
    " görev"         0.07
    "-kit"           0.07
    "charAt"         0.07
    "training"       0.07
    ".pr"            0.06
    "-connected"     0.06
    " το"            0.06
    Activation Density: 0.007%

    No Known Activations