INDEX
Explanations
details about family structure and living arrangements
Negative Logits
myself       -0.18
conc         -0.17
girlfriend   -0.15
oneself      -0.15
esses        -0.14
伙           -0.14
solo         -0.14
Assembly     -0.14
colleg       -0.14
football     -0.14
Positive Logits
themselves   0.33
their        0.20
thems        0.20
Their        0.20
Their        0.20
both         0.19
yourselves   0.19
BOTH         0.19
their        0.18
обо          0.18
Activations Density 0.370%