INDEX
Explanations
adjectives relating to color, specifically the color red
references to the color red
Negative Logits
Token       Logit
ernel       -0.81
agall       -0.74
awaru       -0.72
Ö¼          -0.70
Reloaded    -0.68
vre         -0.68
UGH         -0.67
XT          -0.67
Lank        -0.65
ILA         -0.65
Positive Logits
Token       Logit
efined      1.23
iscovered   1.14
irection    1.13
neck        1.13
oubt        1.12
iscover     1.12
iscovery    1.11
rawn        1.08
oub         1.04
velvet      1.00
Activations Density: 0.026%