INDEX
Explanations
expressions of gratitude
Negative Logits
Token           Logit
ocurrido        -0.68
Castor          -0.67
اصله            -0.64
seur            -0.63
$               -0.63
öbb             -0.62
Ivoire          -0.62
وتسجيلات        -0.62
s               -0.60
obicei          -0.60
Positive Logits
Token           Logit
kyou            1.11
Thank           1.05
thank           1.04
thank           1.02
Thank           0.99
THANK           0.90
thanks          0.88
imageNamed      0.87
THANK           0.84
thanks          0.82
Activations Density: 0.034%