INDEX
Explanations
colors being described or mentioned
references to colors
Negative Logits
Token    Logit
doms     -0.84
idem     -0.79
_-       -0.77
ren      -0.76
TI       -0.76
York     -0.72
GS       -0.72
ern      -0.70
ctors    -0.68
Xi       -0.67
POSITIVE LOGITS
Token    Logit
colors   1.35
colours  1.26
pace     1.08
palette  1.03
color    0.99
creen    0.97
dots     0.95
tint     0.94
anguage  0.94
colored  0.93
Activations Density 0.011%