Explanations
inconsistencies and contradictions in narratives or reports
Negative Logits
doma      -0.16
ceptive   -0.15
_:*       -0.14
parer     -0.13
umbo      -0.13
isini     -0.13
reminded  -0.13
ä»Ĭå¹´    -0.13
herits    -0.13
ê´        -0.13
Positive Logits
story     0.22
version   0.21
timeline  0.19
claims    0.19
versions  0.19
claims    0.19
claimed   0.18
Claims    0.18
accounts  0.18
accounts  0.18
Activations Density: 0.224%