INDEX
NEGATIVE LOGITS
bo            -0.08
correction    -0.08
umba          -0.07
excit         -0.07
Teatro        -0.07
anj           -0.07
uu            -0.07
cello         -0.07
_order        -0.07
lists         -0.07
POSITIVE LOGITS
clueless      0.09
shrug         0.09
(Client       0.09
̆             0.08
nihil         0.08
shrugged      0.08
ientation     0.08
Client        0.08
pog           0.08
soldi         0.08
Activations Density 0.003%