Explanations
statements indicating the presence or absence of evidence or suggestions
phrases indicating a lack of evidence or denial of claims
Negative Logits
cellaneous       -0.70
ione             -0.67
è»               -0.65
Boss             -0.65
Dialogue         -0.63
izont            -0.63
inic             -0.62
Pieces           -0.61
feet             -0.61
largeDownload    -0.61
Positive Logits
anymore          1.05
slightest        0.91
anyone           0.86
bothered         0.84
contradicts      0.84
whatsoever       0.82
justifies        0.81
anybody          0.79
anything         0.77
bothers          0.76
Activations Density 0.158%
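
For readers unfamiliar with the density figure, here is a minimal sketch of how such a number could be computed. It assumes that "activations density" means the share of token positions where the feature's activation is above zero; the page does not state its definition. The function name `activation_density`, the `threshold` parameter, and the sample data are hypothetical and are not taken from the dashboard.

```python
def activation_density(activations, threshold=0.0):
    """Return the percentage of activations strictly above `threshold`.

    Assumption: density is the share of token positions where the
    feature fires; the dashboard's exact definition is not given.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return 100.0 * active / len(activations)


# Hypothetical data: the feature fires on 3 of 2000 token positions.
sample = [0.0] * 1997 + [0.4, 1.2, 0.05]
density = activation_density(sample)
print(f"{density:.3f}%")
```

Under this reading, a density of 0.158% would mean the feature is active on roughly 16 of every 10,000 sampled tokens.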