INDEX
Explanations
phrases showing confusion or being confused
mentions of confusion and related states of uncertainty
Negative Logits
ector       -0.69
bors        -0.64
tsky        -0.64
Janeiro     -0.63
Reviewer    -0.63
baugh       -0.63
ACH         -0.61
OHN         -0.61
iaries      -0.61
uner        -0.60
Positive Logits
terminology            0.85
ABOUT                  0.80
ingly                  0.79
meanings               0.78
confuse                0.77
between                0.76
semantics              0.76
otomy                  0.74
notions                0.73
channelAvailability    0.71
Activations Density: 0.128%