Explanations
instances of embarrassment or socially awkward situations
Negative Logits
صوتيه      -0.62
насељу     -0.55
שוליים     -0.51
enumii     -0.48
avid       -0.47
regado     -0.47
PyLong     -0.47
pouss      -0.47
Benedikt   -0.47
fapt       -0.46
Positive Logits
embarrassment   0.90
embarrassed     0.83
Embar           0.80
Embar           0.79
appearance      0.79
المناصب         0.77
embarrassing    0.74
Etiquette       0.74
Appearance      0.73
etiquette       0.73
Activations Density: 0.343%