INDEX
Explanations
punctuation marks in the text
Negative Logits
Token       Logit
ãĥ¼ãĥķ      -0.16
Dre         -0.15
ãĥ³ãĥĪ      -0.15
Westbrook   -0.14
latter      -0.14
            -0.13
ãĥĮ         -0.13
alic        -0.13
éϵ          -0.13
Ham         -0.13

Positive Logits
Token       Logit
_Lean       0.18
Contents    0.17
anders      0.16
_Tis        0.16
ÐĴики       0.15
ön          0.15
_contents   0.15
exels       0.15
_Pods       0.14
_Parms      0.14
Activations Density: 0.147%