INDEX
Explanations
action sentences that involve either hypothetical or speculative scenarios
speculative phrases or statements that imply uncertainty or possibility
Negative Logits
athing          -0.76
bread           -0.71
ohan            -0.69
raint           -0.69
rake            -0.67
Virgin          -0.66
ussions         -0.65
ride            -0.65
Driver          -0.64
verse           -0.63
Positive Logits
misunder         0.87
alternatively    0.85
incent           0.76
unintentional    0.72
idon             0.70
summarized       0.70
©¶æ              0.70
argued           0.69
intu             0.67
exacerbated      0.65
Activations Density 0.168%
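
For readers unfamiliar with the density figure, the sketch below shows one common way such a number is computed: the fraction of token positions on which the feature's activation is above zero. This is an assumption about the metric, not something the page states. The function name `activation_density`, the `threshold` parameter, and the sample counts are all hypothetical and chosen only for illustration.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds `threshold`.

    Assumption: "density" means the share of positions where the feature fires.
    """
    if not activations:
        return 0.0
    fired = sum(1 for a in activations if a > threshold)
    return fired / len(activations)


# Hypothetical sample: 100,000 token positions, 168 of which activate the feature.
sample = [0.0] * 99_832 + [1.5] * 168
density = activation_density(sample)
print(f"{density:.3%}")
```

Under this reading, a density of 0.168% would mean the feature fires on roughly 168 of every 100,000 tokens, which is consistent with a sparse feature.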