Explanations
references to interacting with or being around other people
references to social interactions with others
Negative Logits
convergence     -0.76
ories           -0.70
Nanto           -0.62
MIT             -0.60
deterrence      -0.60
tnc             -0.59
Industrial      -0.59
adobe           -0.59
CONCLUS         -0.59
modernization   -0.57
Positive Logits
whom            1.14
else            1.01
who             0.97
soever          0.91
else            0.85
folk            0.83
intimately      0.83
friend          0.82
ÃŃs             0.79
imity           0.79
Activations Density 0.321%