INDEX
Explanations
references to individuals and individualism
Negative Logits
dden                -0.17
گاÙĩ                -0.16
нд                  -0.16
ëģĶ                 -0.15
majority            -0.15
bulan               -0.14
ArgumentException   -0.14
own                 -0.14
ylon                -0.13
IDEO                -0.13
Positive Logits
ized                0.26
itarian             0.21
/group              0.21
ités                0.20
/groups             0.20
/team               0.20
ize                 0.20
ity                 0.19
hood                0.19
/single             0.18
Activations Density 0.029%