INDEX
Explanations
words related to social issues and political opinions
keywords associated with controversial social and political issues
Negative Logits
Token        Logit
Ern          -0.65
ength        -0.61
Hilbert      -0.60
Canaver      -0.60
ume          -0.59
Gil          -0.58
udo          -0.58
Nanto        -0.58
Engels       -0.58
Thumbnail    -0.58
Positive Logits
Token        Logit
bably        0.72
vantage      0.71
deserved     0.68
oud          0.66
ropri        0.63
ont          0.62
uci          0.62
oton         0.61
mon          0.60
osaurs       0.60
Activations Density: 0.653%