Explanations
references to membership or participation in organizations or groups
Negative Logits
Token        Logit
totaled      -0.16
vertiser     -0.14
totaling     -0.14
coni         -0.14
[assembly    -0.14
ripp         -0.14
alk          -0.14
ål           -0.14
ots          -0.13
<0x94>回     -0.13
Positive Logits
Token        Logit
うち         0.38
mostly       0.29
majority     0.26
其中         0.25
including    0.25
most         0.24
mostly       0.24
대부분       0.24
，其中       0.23
davon        0.23
Activations Density 0.193%
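
For readers unfamiliar with the density figure, here is a minimal sketch of one common way such a number is computed: the fraction of token positions on which the feature's activation is above zero. This definition, the function name `activation_density`, the `threshold` parameter, and the sample data are all assumptions made for illustration; the page itself does not state how the value was derived.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of positions whose activation exceeds `threshold`.

    Assumed definition (not confirmed by the source page).
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical data: 100,000 token positions, 193 of them active.
sample = [0.0] * 99_807 + [1.5] * 193
print(f"{activation_density(sample):.3%}")
```

Under this reading, a density of 0.193% would mean the feature fires on roughly 2 of every 1,000 tokens.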