NEGATIVE LOGITS
Token      Logit
�          -0.06
stup       -0.06
旅         -0.06
ців        -0.06
休         -0.06
someone    -0.06
okit       -0.06
евых       -0.06
広         -0.06
χής        -0.06
POSITIVE LOGITS
Token           Logit
inflammatory    0.08
/fa             0.07
editorial       0.07
proudly         0.07
reversible      0.06
discovering     0.06
clearly         0.06
activates       0.06
produced        0.06
out             0.06
Activations Density: 0.016%