INDEX
Explanations
mentions of the color white
references to "White" in various contexts
Negative Logits
ITAL          -0.84
Occup         -0.76
WATCHED       -0.75
ENDED         -0.72
odcast        -0.71
cffffcc       -0.70
Completed     -0.70
obbies        -0.70
raints        -0.70
INGTON        -0.67
Positive Logits
caps           1.13
house          1.09
berry          1.03
horse          0.99
supremacist    0.98
supremacists   0.91
white          0.89
White          0.89
houses         0.88
mouth          0.86
Activations Density: 0.019%