INDEX
Explanations
instances of the word "I" indicating personal narratives or reflections
Negative Logits
bian          -0.18
ilion         -0.16
bis           -0.15
alars         -0.15
inals         -0.14
stret         -0.14
ScreenState   -0.14
Inquiry       -0.14
izo           -0.14
ican          -0.14
Positive Logits
eless         0.15
Sexe          0.15
ibri          0.15
ivated        0.14
ethyst        0.14
Rouge         0.14
ạp            0.14
otty          0.14
cul           0.14
srv           0.14
Activations Density: 0.056%