INDEX
Explanations
mentions of sources of light or their characteristics
references to various forms of "lights."
Negative Logits
Token         Logit
via           -0.71
ese           -0.68
Attempts      -0.65
Relations     -0.65
clair         -0.64
coli          -0.64
gress         -0.63
ESE           -0.62
Interview     -0.62
Expression    -0.61
Positive Logits
Token         Logit
pots          1.01
lights        1.00
bulb          0.99
creen         0.98
bulbs         0.94
peed          0.93
lights        0.91
aber          0.91
ilver         0.91
torches       0.90
Activations Density 0.009%
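
The page does not define how the density figure is computed. A minimal sketch, assuming it is the fraction of sampled token positions on which the feature has a non-zero activation: under that reading, 0.009% corresponds to roughly 9 active tokens per 100,000. The function name `activation_density` and the sample data are hypothetical and only illustrate the arithmetic.

```python
def activation_density(activations):
    """Return the fraction of token positions with a non-zero activation.

    Assumption: "density" means (number of active tokens) / (total tokens).
    """
    if not activations:
        raise ValueError("activations must not be empty")
    active = sum(1 for a in activations if a > 0)
    return active / len(activations)


# Hypothetical sample: 9 active positions out of 100,000 tokens.
sample = [0.0] * 99_991 + [1.5] * 9
density = activation_density(sample)
print(f"{density:.3%}")  # prints 0.009%
```

A density this low suggests the feature fires on a very small share of tokens, which fits a narrow concept such as light sources.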