Negative Logits
intios      -0.68
المكان      -0.64
fevere      -0.62
&___        -0.61
propOrder   -0.60
ckså        -0.60
fubject     -0.60
Gesch       -0.60
thorny      -0.60
اقتصاد      -0.59
POSITIVE LOGITS
matcher     0.47
woman       0.45
mens        0.41
rator       0.41
board       0.40
<()>        0.39
allus       0.39
ring        0.39
rager       0.38
pend        0.38
Activations Density 0.003%