INDEX
NEGATIVE LOGITS
idda  -0.09
hurdles  -0.08
redevelopment  -0.08
dared  -0.08
immers  -0.08
رم  -0.07
atl  -0.07
নিব  -0.07
র্ত  -0.07
chic  -0.07
POSITIVE LOGITS
поведения  0.12
Verhalten  0.12
behaved  0.12
comportement  0.12
correctness  0.12
behavior  0.12
behaviour  0.12
behave  0.11
আচ  0.11
comportamento  0.11
Activations Density: 0.030%