Explanations
references to individuals, specifically personal pronouns and related terms
Negative Logits
$MESS     -0.17
$LANG     -0.16
hausen    -0.15
Either    -0.15
nor       -0.14
CS        -0.14
streams   -0.14
sami      -0.13
IE        -0.13
.SDK      -0.13
Positive Logits
/her      0.65
/she      0.54
.her      0.39
her       0.39
hers      0.34
her       0.31
Her       0.29
/h        0.29
panic     0.28
she       0.28
Activations Density 0.093%