INDEX
Explanations
phrases related to expectations and frustrations in social contexts
Negative Logits
PRESSION    -0.16
omy         -0.15
ãĤ¥         -0.15
ãĥķãĤ       -0.15
.ToShort    -0.14
IDES        -0.14
FORMANCE    -0.14
eres        -0.13
ERGE        -0.13
Feder       -0.13
Positive Logits
broken      0.16
_snap       0.15
Broken      0.15
diabetic    0.15
æĬĺ         0.14
Sokol       0.14
,           0.14
Adv         0.14
a           0.14
yled        0.14
Activations Density 0.038%