Explanations
colors and descriptive attributes associated with objects or characters
Negative Logits
Token     Logit
dark      -0.18
Silver    -0.15
doz       -0.15
redd      -0.15
Silver    -0.15
Sized     -0.15
Dark      -0.15
black     -0.15
Purple    -0.15
amber     -0.14
Positive Logits
Token        Logit
-and         0.33
ish          0.23
-colored     0.23
-striped     0.22
-bordered    0.22
-col         0.21
/red         0.21
色的         0.20
-highlight   0.20
-cl          0.19
Activations Density 0.084%
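
For readers unfamiliar with the density figure, the sketch below shows one common reading of it: the fraction of token positions at which the feature's activation is above zero. This is an assumption about how the number is defined, not something the page states. The function name `activation_density`, the `threshold` parameter, and the sample data are all hypothetical and chosen only so that the arithmetic lands on the same percentage.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of positions whose activation exceeds `threshold`.

    Assumed definition: a position counts as "active" when its
    activation is strictly greater than the threshold (0.0 by default).
    """
    if not activations:
        raise ValueError("need at least one activation value")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical data (not taken from the dashboard): 100,000 token
# positions, of which 84 activate the feature.
sample = [0.0] * 99_916 + [1.0] * 84
density = activation_density(sample)
print(f"{density:.3%}")  # prints 0.084%
```

Under this reading, a density of 0.084% means the feature fires on fewer than one token in a thousand, which fits a narrowly scoped feature like the color-attribute one described above.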