Explanations
human expressions and reactions
Negative Logits
Token         Logit
sürekli       0.38
趵            0.38
тров          0.37
screaming     0.36
continually   0.34
ለያ            0.34
continuously  0.34
解决了        0.34
/*/           0.33
अक्टूबर       0.33
Positive Logits
Token     Logit
smiled    1.01
nodded    0.97
chuckled  0.95
replied   0.92
laughed   0.91
chuckle   0.91
smile     0.90
shrug     0.85
reply     0.85
nodding   0.85
Activations Density: 0.027%