INDEX
Explanations
the phrase "not sure."
expressions of uncertainty or doubt
Negative Logits
Ĥİ        -0.78
emonium   -0.77
verse     -0.75
inator    -0.67
%]        -0.65
cano      -0.65
ufact     -0.63
inators   -0.63
ulus      -0.61
gencies   -0.61
Positive Logits
why       1.11
whether   1.10
how       1.10
about     0.91
if        0.90
exactly   0.87
WHY       0.87
why       0.84
what      0.82
anymore   0.74
Activations Density 0.045%