INDEX
Negative Logits
Token       Logit
τοκ         -0.07
nth         -0.06
�           -0.06
�           -0.06
Wolverine   -0.06
In          -0.06
dale        -0.06
(cors       -0.06
ç           -0.06
"I          -0.06
Positive Logits
Token       Logit
thought     0.07
referring   0.07
Figure      0.06
curiosity   0.06
Next        0.06
=target     0.06
question    0.06
selecting   0.06
trained     0.06
Ogre        0.06
Activations Density: 0.141%