Explanations
sentences starting with "The truth is" that provide a statement of fact or opinion
statements asserting a truth or fact
Negative Logits
throp     -0.78
inges     -0.68
uld       -0.66
andise    -0.66
ivil      -0.65
pled      -0.64
acies     -0.63
ypes      -0.62
eor       -0.62
asts      -0.62
Positive Logits
not        0.86
NOT        0.85
Solitaire  0.80
nothing    0.80
unclear    0.76
neither    0.75
olation    0.75
Reviewer   0.74
none       0.74
indeed     0.74
Activations Density 0.120%