INDEX
Explanations
claims or assertions being made in the text
references to assertions and claims regarding social or political issues
Negative Logits
foreseen      -0.79
endars        -0.74
wills         -0.71
keys          -0.70
Lumpur        -0.70
artney        -0.67
thumbnails    -0.67
ioch          -0.67
Skydragon     -0.65
NetMessage    -0.65
Positive Logits
uttered       0.92
argument      0.91
debunked      0.89
dispro        0.87
assertion     0.84
premise       0.84
refuted       0.84
asserted      0.82
vehemently    0.80
ument         0.78
Activations Density 0.179%