Explanations
warm and affectionate speech
Negative Logits
objectionable    0.88
deleterious      0.85
prosedur         0.84
상당히           0.83
Pareto           0.80
詭               0.80
necessitating    0.80
metodologia      0.79
വിശദ             0.79
prototypical     0.78
Positive Logits
heartfelt        2.04
❤️               1.87
loving           1.79
cariño           1.78
hearts           1.75
❤️               1.73
hugs             1.68
heartwarming     1.67
friendship       1.66
cuddle           1.65
Activations Density 1.108%
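
A density figure like the one above can be read as the share of sampled token positions on which the feature fires. The sketch below is one way to compute such a percentage, and it rests on assumptions the page does not state: that "active" means an activation strictly greater than zero, and that density is taken over all sampled positions. The function name `activation_density`, the `threshold` parameter, and the sample data are hypothetical and serve only as illustration.

```python
def activation_density(activations, threshold=0.0):
    """Return the percentage of activations strictly above `threshold`.

    Assumption: a position counts as active when its activation is
    greater than `threshold`; the dashboard's exact rule is not given.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return 100.0 * active / len(activations)


# Hypothetical sample: 1000 token positions, 11 of which are active.
sample = [0.0] * 989 + [0.5] * 11
density = activation_density(sample)
print(f"Activations Density {density:.3f}%")
```

A low percentage of this kind indicates a sparse feature, which fits a feature described as warm and affectionate speech: most text contains no such language.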