Explanations
mentions of the word "majority" in various contexts
Negative Logits
ian         -0.18
all         -0.17
icamente    -0.16
eno         -0.16
baum        -0.16
more        -0.16
ible        -0.16
ania        -0.15
iam         -0.15
ong         -0.15
Positive Logits
/min        0.23
itarian     0.20
opinion     0.19
-min        0.16
acci        0.16
-Muslim     0.16
whip        0.16
-major      0.16
itä         0.16
vote        0.15
Activations Density 0.014%