INDEX
Explanations
expressions of deep emotional connections and commitments to personal beliefs or projects
Negative Logits
Efq             -0.61
vece            -0.57
Monfieur        -0.57
oredCriteria    -0.55
Chriftian       -0.50
مشين            -0.49
telep           -0.49
Perſ            -0.49
Diſ             -0.49
avadoc          -0.49
Positive Logits
dearly          0.92
paixão          0.89
love            0.85
dear            0.84
loves           0.83
favorite        0.80
dearest         0.79
favorite        0.78
sentimental     0.75
passion         0.74
Activations Density 0.136%