INDEX
Explanations
white supremacist and nationalist ideologies
Negative Logits
Looking  0.57
IV       0.56
USING    0.55
Mumbai   0.55
R        0.55
Player   0.54
SER      0.54
Wid      0.54
D        0.54
*        0.54
Positive Logits
supremac         0.69
supremacist      0.66
shovel           0.65
clarified        0.65
clarifications   0.63
archived         0.61
occidental       0.61
glaciers         0.61
Charlottesville  0.59
bicyclists       0.59
Activations Density: 0.009%