INDEX
Explanations
expressions of approval, disapproval, and commendation in the context of organizational or societal matters
Negative Logits
efe       -0.17
³         -0.17
Spinner   -0.15
ean       -0.15
IRROR     -0.14
.relu     -0.14
lez       -0.14
ừ         -0.14
ulan      -0.14
ead       -0.14
Positive Logits
/owl          0.16
Ø·Ùģ          0.15
istrovstvÃŃ   0.14
pha           0.14
δά            0.14
åĮ            0.14
íĬ¹íŀĪ        0.14
Michaels      0.14
bip           0.13
_PF           0.13
Activations Density: 0.169%