Explanations
phrases related to uncertainty or speculation
phrases that express uncertainty or questions about knowledge and opinions
Negative Logits
srfAttach       -0.58
bryce           -0.55
catentry        -0.54
respectively    -0.52
anwhile         -0.50
accompanied     -0.50
Pwr             -0.49
runoff          -0.47
refill          -0.47
ornia           -0.46
Positive Logits
miscar          0.56
spoilers        0.53
nodd            0.53
_>              0.52
explanations    0.52
motives         0.50
mistakes        0.50
vez             0.49
endings         0.47
;)              0.46
Activations Density: 2.179%