INDEX
Explanations
references to Donald Trump
Negative Logits
favored     -0.18
Slut        -0.15
otte        -0.15
adel        -0.15
analyze     -0.15
Savior      -0.15
defense     -0.15
Defense     -0.15
ops         -0.14
-defense    -0.14
Positive Logits
Guardian    0.25
guard       0.20
US          0.20
uard        0.19
Guard       0.19
Sir         0.17
Telegraph   0.17
زÛĮ         0.17
guardian    0.17
Photograph  0.16
Activations Density: 0.254%