Explanations
expressions of disappointment regarding performances or productions
Negative Logits
Token             Logit
upy               -0.15
μη                -0.15
------+------+    -0.15
etri              -0.14
OLON              -0.14
Sandbox           -0.14
borr              -0.14
KI                -0.14
Greenland         -0.13
سرد               -0.13
Positive Logits
Token             Logit
Beast             0.41
Belle             0.38
Beauty            0.38
Gast              0.35
Beauty            0.33
Disney            0.32
Disney            0.28
beast             0.28
Lum               0.28
belle             0.25
Activations Density: 0.014%