Explanations
playful romantic utterances
Negative Logits
ائع  0.45
अनेक  0.44
🙏🙏  0.43
近年  0.43
क्षित  0.43
Remain  0.43
!\  0.42
!}\  0.42
%!  0.42
!}{  0.42
Positive Logits
smirk  0.63
Boyfriend  0.61
bored  0.59
😘  0.59
😘  0.58
boyfriend  0.58
sweetheart  0.57
jealous  0.56
babe  0.55
💋  0.54
Activations Density 0.024%