INDEX
Explanations
words related to colors
references to pale-colored items or concepts
Negative Logits
ACTIONS       -0.66
loudly        -0.63
æ©Ł           -0.62
loose         -0.62
malfunction   -0.61
DOM           -0.60
ÄŁ            -0.60
jammed        -0.57
////          -0.56
Sequence      -0.55
Positive Logits
olithic       1.51
ozo           1.17
ohyd          1.10
ocl           1.00
ogen          1.00
ogram         0.97
ogene         0.96
opath         0.96
ont           0.95
croft         0.94
Activations Density: 0.034%