INDEX
Explanations
terms related to anti-something products or concepts
terms related to anti-technology or anti-product positions
Negative Logits
marches          -0.75
nces             -0.74
Klux             -0.73
precincts        -0.70
aturdays         -0.69
embraces         -0.69
interviews       -0.69
democratically   -0.68
aturday          -0.67
Reich            -0.67
Positive Logits
inflammatory     0.97
absor            0.94
hazard           0.93
gravity          0.93
cheat            0.92
avoid            0.92
orb              0.91
vir              0.87
destruct         0.87
oxide            0.87
Activations Density: 0.063%