Negative Logits
(writer         -0.08
�               -0.07
ophile          -0.07
uropean         -0.06
IMAL            -0.06
steer           -0.06
ogeneity        -0.06
odb             -0.06
Registration    -0.06
xử              -0.06

Positive Logits
'є              0.07
mistakenly      0.07
PMID            0.07
nephew          0.06
Coins           0.06
Absolutely      0.06
Jake            0.06
্               0.06
unary           0.06
ε               0.06
Activations Density: 0.105%