INDEX
Explanations
phrases describing brightness or vividness
Negative Logits
nd      -0.15
gba     -0.15
akin    -0.15
tones   -0.14
ese     -0.14
ÌĤ      -0.14
ug      -0.14
plevel  -0.14
alog    -0.14
quet    -0.14
Positive Logits
ãĥ«ãĥī  0.17
rier    0.15
-eyed   0.15
êµ´     0.15
sik     0.15
yw      0.15
-light  0.14
èĢĢ     0.14
lil     0.14
onte    0.14
Activations Density 0.095%
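
Several tokens in the logit lists (for example `ãĥ«ãĥī`, `êµ´`, `èĢĢ`) look like the display form of a byte-level BPE vocabulary, where each raw byte is shown as a stand-in printable character. The sketch below assumes a GPT-2 style byte-to-character table, which this page does not confirm; the helper names `bytes_to_unicode` and `decode_token` are illustrative only. Under that assumption it maps each displayed character back to its byte and decodes the result as UTF-8.

```python
def bytes_to_unicode():
    """Build the GPT-2 style byte -> printable character table (assumed tokenizer)."""
    # Bytes that are already printable keep their own code point.
    printable = (
        list(range(ord("!"), ord("~") + 1))
        + list(range(ord("¡"), ord("¬") + 1))
        + list(range(ord("®"), ord("ÿ") + 1))
    )
    byte_values = printable[:]
    code_points = printable[:]
    offset = 0
    # Every remaining byte is shifted to an unused code point starting at 256.
    for b in range(256):
        if b not in printable:
            byte_values.append(b)
            code_points.append(256 + offset)
            offset += 1
    return {b: chr(c) for b, c in zip(byte_values, code_points)}


# Inverse table: displayed character -> original byte value.
CHAR_TO_BYTE = {ch: b for b, ch in bytes_to_unicode().items()}


def decode_token(token: str) -> str:
    """Turn a displayed token string back into readable text.

    Incomplete UTF-8 sequences (a token may hold only part of a character)
    are replaced with U+FFFD rather than raising an error.
    """
    raw = bytes(CHAR_TO_BYTE[ch] for ch in token)
    return raw.decode("utf-8", errors="replace")


if __name__ == "__main__":
    for tok in ["ãĥ«ãĥī", "êµ´", "èĢĢ", "ÌĤ", "-light"]:
        print(repr(tok), "->", repr(decode_token(tok)))
```

If the assumption holds, `èĢĢ` decodes to the character 耀 and `ãĥ«ãĥī` to the katakana ルド, while plain ASCII tokens such as `-light` come back unchanged. A token containing a character outside the table would raise a `KeyError`, which is a hint that the vocabulary uses a different display scheme.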