Explanations
phrases related to political events and opinions
references to political events and figures
Negative Logits
ulates     -0.67
etermin    -0.67
otin       -0.66
climbs     -0.65
ranges     -0.63
advises    -0.63
presses    -0.63
uces       -0.62
ati        -0.60
muse       -0.60
Positive Logits
didn         1.27
Had          1.22
Didn         1.21
Was          1.14
lacked       1.11
Had          1.10
Was          1.09
didn         1.07
hindsight    1.05
did          1.05
Activations Density: 1.021%