INDEX
Explanations
words indicating evaluation or judgment, particularly regarding communication and social expectations
Negative Logits
.DEFINE     -0.15
Colbert     -0.15
azu         -0.14
emet        -0.14
alendar     -0.14
RUS         -0.14
zung        -0.14
าศ          -0.14
lesia       -0.14
SU          -0.14
Positive Logits
Mort        0.32
mort        0.27
mort        0.26
-mort       0.25
Mortgage    0.25
Tor         0.24
Tor         0.24
Stock       0.21
mortgage    0.21
Hed         0.20
Activations Density 0.004%