INDEX
Explanations
pronouns or nouns denoting individuals distancing themselves from others or groups
Negative Logits
understatement    -0.62
WORK              -0.62
roundup           -0.61
amental           -0.57
heny              -0.56
Stephenson        -0.56
visitation        -0.56
nightmares        -0.55
Fulton            -0.55
eyebrow           -0.54
Positive Logits
selves            0.88
é¾įåĸļ士          0.81
creatively        0.79
ternally          0.77
DragonMagazine    0.77
territ            0.74
tremend           0.74
é¾įå¥ij士          0.71
withd             0.71
éĸ                0.70
Activations Density 0.043%