INDEX
Explanations
phrases related to interactions or opinions involving other people
references to social interactions and perceptions about other individuals
Negative Logits
Accessory    -0.90
ouf          -0.65
Humane       -0.63
Arsenal      -0.62
Utility      -0.59
wagen        -0.59
Luck         -0.58
LOCK         -0.58
HUN          -0.57
Heist        -0.56
Positive Logits
worldly         0.97
besides         0.93
describ         0.82
iations         0.74
than            0.73
than            0.70
paces           0.69
perspectives    0.69
equally         0.69
avenues         0.65
Activations Density 0.177%