INDEX
Explanations
instances of the term "fake news" with varying emphases
references to "fake news" and related criticisms of media credibility
Negative Logits
Token       Logit
ktop        -0.86
artney      -0.83
aird        -0.82
illes       -0.79
onding      -0.74
atal        -0.74
foreseen    -0.73
union       -0.73
cise        -0.72
enture      -0.71
Positive Logits
Token             Logit
misinformation    1.07
disinformation    1.04
ument             1.04
falsehood         1.01
debunk            0.97
debunked          0.96
perpetrated       0.95
False             0.94
propag            0.90
nonsense          0.89
Activations Density: 0.234%
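
For readers unfamiliar with the density figure, here is a minimal sketch of how such a number could be computed, assuming that density means the fraction of sampled tokens on which the feature's activation is above zero. The page itself does not state the definition. The function name `activation_density`, the `threshold` parameter, and the sample data are all hypothetical.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of tokens whose activation exceeds `threshold`.

    Assumption: "density" is the share of tokens on which the feature
    fires at all; the source page does not define the term.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: 1000 tokens, the feature fires on 3 of them.
sample = [0.0] * 997 + [1.2, 0.4, 3.1]
density = activation_density(sample)
print(f"{density:.3%}")
```

Under this reading, a density of 0.234% would mean the feature fires on roughly 2 to 3 tokens out of every 1000 sampled.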