Explanations
instances of the pronoun "I" and related acronyms
Negative Logits
rary        -0.17
itorio      -0.17
šak         -0.15
_TestCase   -0.15
anki        -0.14
mony        -0.14
enty        -0.14
haf         -0.14
enny        -0.14
faction     -0.14
Positive Logits
994         0.14
PRI         0.14
еÑĤÑĮ      0.14
İli         0.14
Bender      0.14
fod         0.13
;č↵         0.13
wm          0.13
ava         0.13
ippet       0.13
Activations Density: 0.038%