Explanations
phrases or words associated with biased reporting
references to bias in media reporting
Negative Logits
ft          -0.85
clamation   -0.82
ptoms       -0.82
••••        -0.77
phrase      -0.76
hner        -0.72
chal        -0.72
thur        -0.71
forts       -0.71
mia         -0.70
Positive Logits
biased      1.28
unbiased    1.08
impartial   0.97
biases      0.79
undermin    0.78
opinions    0.78
bias        0.77
observers   0.77
viewpoints  0.75
citiz       0.74
Activations Density 0.014%