Explanations
expressions of love and affection
Negative Logits
Token       Logit
TagHelper   -0.59
aandacht    -0.59
Efq         -0.57
xhttp       -0.56
cdti        -0.55
jooq        -0.54
sexe        -0.53
themſelves  -0.51
ſel         -0.51
poupée      -0.51
Positive Logits
Token   Logit
loves   1.31
loved   1.11
LOVED   1.05
loved   0.97
Loves   0.97
Loved   0.96
hated   0.96
likes   0.96
loves   0.95
liked   0.94
Activations Density: 0.112%