INDEX
Explanations
references to physical appearances, especially facial expressions and features
Negative Logits
Token      Logit
udes       -0.19
ieder      -0.17
овÑĸд      -0.17
cedure     -0.17
esus       -0.16
AYOUT      -0.15
141        -0.15
ency       -0.15
AndView    -0.15
aptor      -0.14
Positive Logits
Token      Logit
/head      0.18
Rooney     0.17
(face      0.15
ushima     0.14
-face      0.14
ually      0.14
candy      0.13
faces      0.13
/body      0.13
-quote     0.13
Activations Density: 0.036%