INDEX
Explanations
phrases related to trustworthy news or newsletters
questions about trustworthy news sources
Negative Logits
ortium          -0.73
ioned           -0.70
onne            -0.68
pherd           -0.67
inguished       -0.66
subconscious    -0.65
edi             -0.63
phased          -0.62
ignt            -0.62
uctor           -0.62
Positive Logits
iframe          0.69
Airl            0.68
BRE             0.67
Transcript      0.67
taboola         0.67
UNHCR           0.65
Cookies         0.64
Shelter         0.64
Afee            0.63
Subscribe       0.63
Activations Density 0.059%