INDEX
Explanations
references to facial expressions and emotional states
NEGATIVE LOGITS
egin      -0.20
rollo     -0.17
dizzy     -0.15
urette    -0.14
ovsky     -0.14
mind      -0.14
buzz      -0.14
braco     -0.13
Sharper   -0.13
phet      -0.13
POSITIVE LOGITS
expression    0.43
expressions   0.38
Expression    0.37
expression    0.35
Expression    0.35
-expression   0.34
facial        0.33
expres        0.32
表情          0.30
Expressions   0.28
Activations Density 0.211%