INDEX
Explanations
phrases indicating upcoming elaboration or emphasis in speech
expressions of uncertainty or ambivalence about a situation
Negative Logits
ascript    -0.76
cific      -0.76
ithing     -0.73
igent      -0.71
stone      -0.71
aintain    -0.69
imeter     -0.65
hower      -0.64
ourses     -0.64
atana      -0.63
Positive Logits
kidding    0.90
damned     0.76
oops       0.73
hilar      0.73
Surprise   0.73
Creep      0.71
downright  0.71
damn       0.70
Bucs       0.70
darn       0.69
Activations Density 1.200%
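
The density figure can be read as the share of sampled token positions on which the feature fires. The page does not define it, so the sketch below assumes that reading: density is the fraction of positions whose activation is above a threshold of zero. The function name `activation_density`, the `threshold` parameter, and the sample data are all hypothetical and chosen only for illustration.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of positions whose activation exceeds `threshold`.

    Assumption: "density" means the share of sampled token positions on
    which the feature is active. The source page does not state this.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Made-up sample: 1000 token positions, 12 of which are active.
sample = [0.0] * 988 + [0.5] * 12
density = activation_density(sample)
print(f"{density:.3%}")
```

With this invented sample, 12 active positions out of 1000 give a density of 0.012, which is formatted the same way as the figure reported above.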