INDEX
Explanations
phrases related to individuals in prominent positions or roles
references to political candidates and their standings
Negative Logits
corpus       -0.60
»Ĵ           -0.60
Kard         -0.60
ibly         -0.59
Twain        -0.58
mediation    -0.58
Temporary    -0.58
insk         -0.58
terday       -0.57
Brune        -0.57
Positive Logits
runner       1.20
facing       1.13
office       1.10
runners      1.03
running      1.02
eyed         0.99
line         0.98
lining       0.96
eye          0.92
Runner       0.92
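A minimal sketch of how token lists like these are commonly produced, assuming the values come from projecting the feature's decoder direction through the model's unembedding matrix and ranking the results. The source does not state the method, and the names `decoder_direction`, `unembedding`, and `vocab`, along with every number below, are hypothetical toy values.

```python
# Hypothetical toy data: a 2-dimensional feature direction and a 4-token vocabulary.
decoder_direction = [1.0, -0.5]
vocab = ["runner", "office", "corpus", "Twain"]
# unembedding[i][j]: weight from hidden dimension i to vocabulary token j.
unembedding = [
    [1.0, 0.8, -0.4, -0.3],
    [-0.4, -0.6, 0.4, 0.5],
]


def logit_effects(direction, unembed, tokens):
    """Project the direction onto each token column: effect_j = sum_i d_i * W[i][j]."""
    effects = {}
    for j, token in enumerate(tokens):
        effects[token] = sum(d * row[j] for d, row in zip(direction, unembed))
    return effects


def top_and_bottom(effects, k):
    """Return the k most positive and the k most negative (token, effect) pairs."""
    ranked = sorted(effects.items(), key=lambda item: item[1], reverse=True)
    return ranked[:k], ranked[::-1][:k]


effects = logit_effects(decoder_direction, unembedding, vocab)
positive, negative = top_and_bottom(effects, 2)
for token, value in positive + negative:
    print(f"{token:<10}{value:6.2f}")
```

With real model weights the same ranking would run over the full vocabulary, which is why fragments such as subword pieces can appear in either list.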
Activations Density 0.033%
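A minimal sketch of one way a density figure like this can be computed, assuming it means the share of sampled token positions on which the feature's activation is above zero. The source does not define the metric, and the `activation_density` helper and the sample values are hypothetical.

```python
def activation_density(activations, threshold=0.0):
    """Fraction of token positions whose activation exceeds the threshold."""
    if not activations:
        raise ValueError("need at least one activation")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: 3 active positions out of 10,000 tokens.
sample = [0.0] * 9997 + [0.7, 1.3, 0.2]
density = activation_density(sample)
print(f"{density:.3%}")
```

A value this small would indicate a sparse feature that fires on only a few tokens per ten thousand.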