NEGATIVE LOGITS
Token       Logit
bait        -0.07
alloween    -0.07
Jan         -0.07
Ser         -0.07
Weber       -0.07
utenberg    -0.06
itor        -0.06
managers    -0.06
Gür         -0.06
makt        -0.06

POSITIVE LOGITS
Token       Logit
rh          0.08
Rhodes      0.08
imple       0.07
snaží       0.07
ره          0.07
139         0.07
PLE         0.07
RH          0.07
Rh          0.07
Rhe         0.06
Activations Density 0.009%