INDEX
Explanations
organization names and abbreviations
Negative Logits
at        0.73
im        0.61
ap        0.61
ون        0.58
های       0.58
ys        0.58
oc        0.57
ческие    0.55
یر        0.55
ce        0.54
Positive Logits
นิด        0.57
kritis    0.55
eftersom  0.54
ଡ         0.54
лото      0.53
налич     0.52
үн        0.52
със       0.52
süt       0.51
sozialen  0.51
Activations Density 0.007%