INDEX
Negative Logits
Token         Logit
limp          -0.09
hoses         -0.09
refor         -0.08
нап           -0.08
vanje         -0.08
🏼            -0.08
_condition    -0.08
hik           -0.07
tub           -0.07
curse         -0.07

Positive Logits
Token         Logit
qe            0.08
poet          0.07
graphite      0.07
Zeb           0.07
betting       0.07
astrolog      0.07
Constitution  0.07
Pike          0.07
analis        0.07
Sahara        0.07

Activations Density: 0.011%