INDEX
Explanations
references to the prefix "neo-" in the text
references to neo-Nazi ideology and related terms
Negative Logits
Token         Logit
hips          -0.77
thumbnail     -0.72
lessly        -0.64
balloons      -0.63
jars          -0.62
loo           -0.61
reservoirs    -0.60
baskets       -0.60
Rings         -0.60
ositories     -0.59
Positive Logits
Token         Logit
Nazi          0.96
ge            0.90
-             0.89
ethical       0.89
christ        0.88
-)            0.88
engineering   0.88
femin         0.87
entimes       0.84
fascist       0.84
Activations Density: 0.028%