INDEX
Explanations
colors and color-related words
references to colors
Negative Logits
doms        -0.99
_-          -0.94
idem        -0.90
=-=-=-=-    -0.74
llah        -0.74
rican       -0.73
uthor       -0.69
ARTICLE     -0.68
ICLE        -0.68
ickson      -0.68
Positive Logits
palette     1.19
colors      1.10
colours     1.04
color       0.98
color       0.95
dye         0.94
anguage     0.93
tint        0.93
colour      0.92
pace        0.89
Activations Density 0.019%