INDEX
Explanations
phrases related to confirming information or confirming a situation
references to conspiracy theories and investigations
Negative Logits
Interstitial     -0.83
intensive        -0.75
Incre            -0.74
Advice           -0.73
Obst             -0.72
barriers         -0.72
flexibility      -0.70
ļéĨĴ             -0.70
progressively    -0.70
ted              -0.69
Positive Logits
indeed           1.24
authentic        0.94
hoax             0.91
faked            0.89
existed          0.88
truth            0.88
exist            0.88
llah             0.87
authenticity     0.86
existent         0.85
Activations Density: 0.649%