Explanations
mentions of the color gray
references to the color gray
Negative Logits
=-=-=-=-    -0.85
CVE         -0.83
ugal        -0.78
========    -0.78
ÄŁ          -0.77
olkien      -0.76
andals      -0.74
igslist     -0.73
̶           -0.72
ãĥ¼ãĥ³      -0.72
Positive Logits
hound       1.01
beard       0.94
shading     0.91
haired      0.86
ish         0.85
gray        0.84
grey        0.80
linen       0.79
naire       0.77
slime       0.77
Activations Density 0.022%