Explanations
expressions of affection and intimacy
Negative Logits
Token               Logit
interessieren       -0.35
interessiert        -0.34
伙                  -0.34
potential           -0.32
[token not shown]   -0.31
interested          -0.31
…                   -0.30
заинтере            -0.30
利                  -0.29
ve                  -0.28
Positive Logits
Token               Logit
Personendaten       0.92
hugging             0.90
noDo                0.84
cuddling            0.83
tagext              0.81
hug                 0.79
cuddle              0.77
hugs                0.76
ویکیپدی             0.74
hugged              0.73
Activations Density 0.214%
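
The density figure is commonly read as the fraction of sampled tokens on which the feature fires. The page itself does not state a definition, so the sketch below assumes that one and uses made-up data. The names `activation_density`, `threshold`, and `sample` are hypothetical and do not come from any tool's API.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of activations strictly above `threshold`.

    Assumption: "density" means (number of tokens where the feature is
    active) / (total tokens sampled). This is an illustrative definition,
    not one confirmed by the source page.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical data: 214 active tokens out of 100,000 sampled tokens.
sample = [1.0] * 214 + [0.0] * (100_000 - 214)
density = activation_density(sample)
print(f"{density:.3%}")  # prints 0.214%
```

Under this reading, a density of 0.214% means the feature is active on roughly 2 of every 1,000 tokens, which is what you would expect from a sparse feature tied to a narrow topic.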