INDEX
Explanations
contradictory statements or information
Negative Logits
Token           Logit
emetery         -0.70
lov             -0.69
iatrics         -0.69
ubis            -0.66
ourning         -0.64
dain            -0.63
llular          -0.61
ixtape          -0.61
gins            -0.61
fare            -0.60
Positive Logits
Token           Logit
contradictory   0.78
conflicting     0.72
viewpoints      0.71
substant        0.70
contradictions  0.68
contradict      0.66
xual            0.65
juxtap          0.63
sexes           0.60
sides           0.59
Activations Density 11.198%