Explanations
references to political campaigns and events
Negative Logits
Token      Logit
olitics    -0.68
pack       -0.64
¶          -0.64
lly        -0.61
llah       -0.60
nat        -0.59
andom      -0.59
wald       -0.59
reason     -0.58
plan       -0.58
Positive Logits
Token      Logit
which      0.85
featuring  0.75
thereby    0.74
meanwhile  0.71
whose      0.71
Magikarp   0.66
which      0.66
whereby    0.63
precursor  0.62
wherein    0.62
Activations Density: 0.519%