Explanations
references to the color black
references to "black" in various contexts
Negative Logits
Token          Logit
PsyNetMessage  -0.90
=-=-=-=-       -0.85
ablishment     -0.85
Lauder         -0.85
igslist        -0.84
ICLE           -0.83
OPLE           -0.82
mble           -0.78
xon            -0.77
sidx           -0.73
Positive Logits
Token          Logit
smith          1.20
jack           1.06
ened           1.01
berry          1.01
moon           0.91
hawk           0.91
horse          0.90
bird           0.89
beard          0.86
powder         0.85
Activations Density: 0.034%