INDEX
Negative Logits
glitter        -0.08
Egg            -0.07
Destruction    -0.07
Kahn           -0.07
dı             -0.06
either         -0.06
ilated         -0.06
OA             -0.06
Dirt           -0.06
Dict           -0.06
Positive Logits
verse          0.10
VERSE          0.07
Test           0.07
620            0.06
Passive        0.06
tiler          0.06
stop           0.06
�              0.06
Resolve        0.06
excessive      0.06
Activations Density 0.002%