INDEX
Explanations
phrases expressing disbelief or skepticism
Negative Logits
catentry    -0.82
Survivors   -0.69
lez         -0.69
Orange      -0.63
Jelly       -0.62
Runner      -0.61
Closing     -0.61
skeleton    -0.60
Starts      -0.60
Ripple      -0.59
Positive Logits
suddenly     0.84
knowingly    0.80
equate       0.77
DonaldTrump  0.77
willingly    0.72
intend       0.72
rogram       0.70
Canaver      0.70
condone      0.70
owe          0.69
Activations Density: 0.174%