Explanations
references to ethical concerns and issues related to accountability
preceding "and"
connecting words
Negative Logits
@",     -0.81
)_/¯    -0.77
"):     -0.73
'):     -0.70
$")     -0.68
>({     -0.66
__':    -0.66
')):    -0.65
NUMX    -0.64
)";     -0.64
Positive Logits
and     3.09
และ     1.46
и       1.36
및      1.34
&       1.32
および  1.32
và      1.28
\&      1.18
and     1.15
及び    1.12
Activations Density: 13.135%