INDEX
Explanations
pronouns and names of individuals
the repetition of pronouns referring to individuals
Negative Logits
ruciating   -0.75
Reviewer    -0.73
Globe       -0.72
opolis      -0.71
Concord     -0.68
Claire      -0.68
ogue        -0.68
Hunting     -0.68
Thrones     -0.66
arios       -0.64
Positive Logits
'll         1.30
're         1.13
'd          1.12
ought       1.00
could       0.99
might       0.96
forgot      0.94
qualifies   0.92
SHOULD      0.91
would       0.90
Activations Density 0.218%