INDEX
Explanations
expressions of bias and subjective opinions
Negative Logits
Token       Logit
èĮ          -0.16
Appropri    -0.16
igu         -0.15
Hir         -0.14
rast        -0.14
(Void       -0.14
онÑĤ        -0.14
ermen       -0.14
æīķ         -0.13
engan       -0.13
Positive Logits
Token       Logit
biased      0.28
bias        0.24
Bias        0.20
biased      0.19
biases      0.18
bias        0.18
ince        0.18
partisan    0.18
ffi         0.16
.toObject   0.16
Activations Density 0.186%
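
The density figure above is not defined on this page. A common reading, assumed here, is the fraction of sampled tokens on which the feature's activation is nonzero. The sketch below illustrates that reading only. The function name `activation_density`, the `threshold` parameter, and the toy data are hypothetical and are not taken from the dashboard.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of tokens whose activation exceeds `threshold`.

    Assumption: "density" means the share of tokens on which the feature fires.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical toy data: the feature fires on 3 of 1000 tokens.
toy = [0.0] * 997 + [0.4, 1.2, 0.05]
print(f"{activation_density(toy):.3%}")  # prints 0.300%
```

Under this reading, a density of 0.186% would mean the feature fires on roughly 2 of every 1,000 tokens.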