Explanations
pronouns referring to people mentioned in news articles
Negative Logits
Colonial                          -0.64
piring                            -0.61
entary                            -0.60
exponential                       -0.60
abound                            -0.59
IB                                -0.58
Stats                             -0.58
rawdownloadcloneembedreportprint  -0.58
Factory                           -0.57
flowing                           -0.57
Positive Logits
'd          1.28
regretted   1.08
personally  1.06
thinks      1.01
wished      1.01
knew        0.99
zbollah     0.98
believes    0.95
intends     0.95
hadn        0.94
Activations Density: 0.221%