INDEX
Explanations
mentions of people, pronouns referring to individuals, and possessive forms
Negative Logits
Token       Logit
izabeth     -0.76
etheless    -0.68
Articles    -0.65
ACP         -0.63
Dialogue    -0.63
Services    -0.62
Experience  -0.61
change      -0.61
Seym        -0.61
Nichols     -0.61
Positive Logits
Token       Logit
panic       1.13
Majesty     1.13
/           0.91
uristic     0.90
mos         0.88
ures        0.87
sing        0.86
reditary    0.85
eding       0.82
eded        0.81
Activations Density: 14.306%