Explanations
references to human anatomy, specifically focusing on the head and its features
Negative Logits
priv           -0.17
[OF            -0.15
@student       -0.15
(LP            -0.14
.observable    -0.14
priv           -0.14
odega          -0.14
iquid          -0.14
dirs           -0.14
pect           -0.14
Positive Logits
/head          0.21
/body          0.20
raki           0.15
Bender         0.15
ume            0.15
Quint          0.14
wig            0.14
Formula        0.14
Masks          0.13
facial         0.13
Activations Density 0.073%