Explanations
instances and expressions of love
Negative Logits
Datuak      -1.12
decembrie   -0.74
standers    -0.71
ؤلاء        -0.67
octombrie   -0.65
tioners     -0.64
barra       -0.63
noiembrie   -0.63
طقة         -0.62
linkovi     -0.62
Positive Logits
love        1.31
LOVE        1.30
Loves       1.25
LOVE        1.25
loves       1.19
Love        1.16
loves       1.14
Loves       1.13
loving      1.11
Love        1.09
Activations Density: 0.045%