INDEX
Explanations
sentences involving uncertainty or lack of knowledge
statements reflecting uncertainty or indecision
Negative Logits
oppers      -0.72
exemplary   -0.68
denomin     -0.67
ielding     -0.67
absor       -0.67
ievers      -0.67
ensured     -0.66
mobil       -0.65
benefiting  -0.65
integral    -0.64
Positive Logits
Maybe     1.09
maybe     1.04
Probably  1.02
Honestly  0.99
Anyway    0.95
maybe     0.93
:(        0.91
dunno     0.89
Maybe     0.87
Honestly  0.87
Activations Density 0.412%