INDEX
Explanations
phrases indicating a choice or alternative option
phrases or words indicating alternative perspectives or choices
Negative Logits
Token        Logit
urated       -0.77
oru          -0.70
usters       -0.70
osponsors    -0.67
anmar        -0.65
gur          -0.65
amia         -0.64
rys          -0.63
ardy         -0.63
rush         -0.62
Positive Logits
Token        Logit
sucks        0.75
imaginable   0.72
else         0.70
depends      0.68
thereafter   0.67
construed    0.65
soever       0.65
preferable   0.65
>>>>         0.64
event        0.63
Activations Density: 0.039%