INDEX
Explanations
phrases indicating thoughts or beliefs
phrases that express thoughts, beliefs, or assumptions
Negative Logits
Token      Logit
verting    -0.77
Rated      -0.68
hs         -0.67
md         -0.67
versions   -0.65
recorded   -0.65
unker      -0.64
draft      -0.62
jab        -0.62
ping       -0.61
Positive Logits
Token            Logit
enance           0.82
logically        0.79
forgiven         0.71
DragonMagazine   0.70
gat              0.68
Esports          0.66
Santos           0.65
otherwise        0.60
Moreno           0.60
THAT             0.59
Activations Density: 0.083%