INDEX
Explanations
proper nouns related to politics and business
references to prominent individuals associated with controversy
Negative Logits
ivity    -0.86
eur      -0.80
ive      -0.79
wives    -0.78
VID      -0.76
ired     -0.73
isl      -0.70
Ns       -0.70
APS      -0.70
wife     -0.70
Positive Logits
iannopoulos     1.22
Yiannopoulos    1.10
andowski        0.88
atan            0.88
chwitz          0.80
Gork            0.78
scl             0.77
enty            0.76
ij士            0.75
memos           0.73
Activations Density: 0.016%