INDEX
Explanations
references to statistical or evaluative statements regarding social issues
Negative Logits
his         -0.22
his         -0.18
uzzi        -0.17
他的        -0.17
그의        -0.17
辅          -0.15
ÜR          -0.15
strcasecmp  -0.15
lán         -0.15
HIS         -0.14
Positive Logits
he          0.28
他          0.21
он          0.20
He          0.19
she         0.19
він         0.18
reference   0.17
He          0.17
point       0.17
Echo        0.17
Activations Density 0.166%