INDEX
Explanations
words related to caring or concern for others
expressions of care or concern for others
NEGATIVE LOGITS
prohibition   -0.81
覚醒          -0.73
ンジ          -0.71
transcript    -0.65
resume        -0.63
resumes       -0.63
perjury       -0.62
confirmation  -0.62
Yugoslavia    -0.61
projection    -0.59
POSITIVE LOGITS
cared    1.16
lessly   1.02
taker    0.93
amily    0.78
giving   0.77
rals     0.76
fully    0.73
der      0.73
lest     0.73
lington  0.72
Activations Density 0.012%