INDEX
Explanations
emotional expressions or reactions
Negative Logits
=
-0.67
Â
-0.65
Â
-0.64
â
-0.63
Ã
-0.63
apatalk
-0.62
${\
-0.61
-0.58
${\
-0.57
冏
-0.57
Positive Logits
🥺
0.88
abt
0.84
✨
0.77
😭😭
0.75
🥺
0.75
😭
0.73
🥲
0.73
,,,
0.73
👀
0.73
💀
0.71
Activations Density 0.193%