INDEX
Explanations
expressions of care, concern, and acknowledgment in interpersonal relationships
Negative Logits
reve      -0.15
nika      -0.15
probe     -0.14
urette    -0.14
yz        -0.14
lew       -0.14
ージ      -0.13
Zug       -0.13
tel       -0.13
лив       -0.13
Positive Logits
бол         0.17
indeed      0.16
oud         0.16
cared       0.15
importance  0.15
cares       0.15
شمالی       0.15
ythe        0.15
oli         0.14
cip         0.14
Activations Density 0.143%