INDEX
Explanations
specific terms or words with a numerical value mentioned in the text
Negative Logits
jri            -1.08
âĹ¼            -1.01
iland          -1.00
CI             -0.96
aukee          -0.96
erker          -0.95
GoldMagikarp   -0.92
ulhu           -0.92
_-             -0.91
gaard          -0.91
Positive Logits
itself         1.45
ultimate       1.11
oneself        1.06
yourself       1.00
himself        1.00
lessness       1.00
icide          0.99
'              0.99
marked         0.98
mark           0.94
Activations Density 0.698%