Explanations
instances of the word "I" indicating personal reflections or opinions
Negative Logits
ly          -0.14
enance      -0.14
f           -0.14
lx          -0.14
erialize    -0.14
itself      -0.13
enberg      -0.13
line        -0.13
less        -0.13
IRST        -0.13
Positive Logits
've         0.20
'm          0.19
iferay      0.19
’m          0.19
’ve         0.19
'll         0.19
'd          0.18
zzo         0.18
’ll         0.18
’d          0.18
Activations Density 0.453%