INDEX
Explanations
terms associated with falsehood or deception
Negative Logits
AssemblyTitle      -0.77
rungsseite         -0.67
LookAnd            -0.57
typelib            -0.56
RectangleBorder    -0.55
Rakyat             -0.54
MainAxisSize       -0.53
Lähteet            -0.52
Portail            -0.52
betweenstory       -0.51
Positive Logits
false      2.02
False      1.75
false      1.59
False      1.55
FALSE      1.22
falsos     1.22
falso      1.19
falsa      1.18
falsas     1.16
FALSE      1.09
Activations Density 0.140%
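
As a rough illustration of how a density figure like the one above could be read, the sketch below assumes that activation density means the fraction of sampled tokens on which the feature has a nonzero activation. That definition, the function name `activation_density`, the `threshold` parameter, and the sample data are all assumptions made for illustration; none of them come from this page.

```python
from typing import Sequence


def activation_density(activations: Sequence[float], threshold: float = 0.0) -> float:
    """Return the fraction of tokens whose activation exceeds `threshold`.

    Assumption: density is counted per token, and a token counts as active
    when its activation is strictly greater than the threshold.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: 10,000 tokens, 14 of which activate the feature.
sample = [0.0] * 9986 + [1.5] * 14
print(f"{activation_density(sample):.3%}")
```

Under this reading, a density of 0.140% would correspond to the feature firing on roughly 14 of every 10,000 tokens.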