INDEX
Explanations
phrases related to criticism of the mainstream media
Negative Logits
cific     -0.81
arcity    -0.76
Tome      -0.74
uana      -0.73
atoon     -0.72
heng      -0.69
alid      -0.69
otos      -0.69
ursed     -0.68
heed      -0.67
Positive Logits
ization       0.97
media         0.94
outlets       0.94
ing           0.89
isation       0.88
ership        0.85
mainstream    0.83
ed            0.79
ized          0.78
ised          0.78
Activations Density 0.079%
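
The density figure above can be read as the share of sampled tokens on which the feature fires at all. The sketch below illustrates that reading under stated assumptions: the page does not say how the value was computed, so the function name `activation_density`, the zero threshold, and the sample data are all hypothetical.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of tokens whose activation exceeds `threshold`.

    Assumption: "density" means the share of tokens with a non-zero
    activation. The dashboard itself does not state its definition.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical data: 10,000 tokens, 8 of which activate the feature.
sample = [0.0] * 9992 + [1.5] * 8
density = activation_density(sample)
print(f"{density:.3%}")  # prints 0.080%
```

On this reading, a density of 0.079% would mean the feature is active on roughly 8 of every 10,000 tokens, which fits a narrow topical feature such as the one described in the explanation.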