Explanations
words related to providing evidence for or against a statement or claim
terms related to validating, disproving, or substantiating claims
Negative Logits
ebus        -0.80
NetMessage  -0.77
iew         -0.69
aned        -0.67
artney      -0.67
oppy        -0.67
atra        -0.65
azard       -0.65
obbies      -0.64
ixtape      -0.64
Positive Logits
corrobor    1.18
substant    1.12
dispro      1.10
debunked    1.03
hypotheses  1.01
refute      0.99
debunk      0.99
assertions  0.95
evidence    0.94
unfounded   0.90
Activations Density: 0.101%