INDEX
Explanations
phrases related to political involvement and activism
occurrences of the word "em"
Negative Logits
redistributed   -0.70
otherwise       -0.69
SourceFile      -0.62
bed             -0.58
dreaded         -0.56
enriched        -0.56
tipped          -0.56
Kubrick         -0.56
hel             -0.54
heaviest        -0.54
Positive Logits
achine          1.23
peror           1.18
issions         1.15
useum           1.12
poral           1.09
erald           1.08
otional         1.08
bourg           1.08
essage          1.06
blance          1.06
Activations Density: 0.035%