Explanations
words related to conflicting or contradictory information
references to conflicting information or reports
Negative Logits
chens       -0.81
abit        -0.79
rix         -0.76
aday        -0.76
fm          -0.75
ney         -0.74
frey        -0.74
bern        -0.73
rection     -0.73
adr         -0.73
Positive Logits
conflicting     1.57
contradictory   1.30
undermin        0.99
discrep         0.97
conflicted      0.93
conflicts       0.89
sexes           0.88
guiActiveUn     0.85
predic          0.84
opinions        0.83
Activations Density: 0.010%