Explanations
phrases related to treatment and perceptions of individuals in social contexts
Negative Logits
Token           Logit
ⓧ               -0.44
readily         -0.43
noDo            -0.43
يتيمه           -0.42
GEBURTSDATUM    -0.42
successful      -0.41
さすがに        -0.41
reliable        -0.39
successful      -0.39
ausreich        -0.38
Positive Logits
Token           Logit
differently     1.24
according       0.71
diffé           0.71
similarly       0.66
inconsist       0.66
Differ          0.63
differ          0.62
accordingly     0.61
incorrectly     0.61
according       0.58
Activations Density: 0.907%