INDEX
Explanations
proper nouns containing color descriptors
occurrences of the word "White."
Negative Logits
udeau        -0.85
ITAL         -0.84
cffffcc      -0.82
ategory      -0.81
Downloadha   -0.79
yrinth       -0.75
odcast       -0.74
ension       -0.73
WATCHED      -0.73
awaru        -0.73
Positive Logits
caps           1.11
supremacist    1.08
house          1.05
Sox            1.05
supremacists   1.04
horse          1.02
berry          0.97
beard          0.95
bread          0.95
houses         0.94
Activations Density 0.029%
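
The page does not define the density figure. A minimal sketch, assuming it means the share of sampled tokens on which the feature has a nonzero activation; the function name, the threshold parameter, and the sample data are hypothetical and only illustrate the arithmetic:

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of tokens whose activation exceeds `threshold`.

    Assumption: "density" is the share of tokens with activation above zero.
    The source page does not state its exact definition.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: 10,000 tokens, 3 of which activate the feature.
sample = [0.0] * 9997 + [1.2, 0.4, 3.1]
density = activation_density(sample)
print(f"{density:.3%}")
```

Under that reading, a density of 0.029% would correspond to the feature firing on roughly 29 of every 100,000 tokens.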