INDEX
Explanations
expressions of emotion or non-verbal communication
expressions of emotions or reactions
Negative Logits
arthed      -0.84
artifacts   -0.76
OTOS        -0.76
icipated    -0.75
sites       -0.74
senal       -0.73
ategory     -0.72
irements    -0.70
inction     -0.70
conting     -0.70
Positive Logits
grin        1.10
smile       0.95
grinning    0.93
impatient   0.92
smiling     0.92
roar        0.90
gigg        0.88
angrily     0.87
exclaim     0.87
smug        0.86
Activations Density: 0.370%