INDEX
Explanations
mentions of political figures and positions
the word "shadow" and its contextual uses
Negative Logits
Token       Logit
urses       -0.91
awaru       -0.77
renheit     -0.76
anchester   -0.75
apsed       -0.74
unker       -0.73
tics        -0.73
keye        -0.71
OPLE        -0.71
Attempts    -0.70
Positive Logits
Token       Logit
moon        0.98
shadow      0.94
Shadow      0.93
shadow      0.87
loo         0.83
Shadow      0.80
Shadows     0.75
boxing      0.75
dust        0.75
busters     0.75
Activations Density: 0.015%