Explanations
expressions indicating hesitation or reluctance
Negative Logits
Token       Logit
beyond      -0.17
Beyond      -0.17
enz         -0.16
Wand        -0.16
illi        -0.14
almost      -0.14
ermalink    -0.14
ocused      -0.14
ÄĽti        -0.14
epar        -0.13
Positive Logits
Token       Logit
nor         0.59
Nor         0.43
nor         0.43
Nor         0.41
anymore     0.34
NOR         0.33
either      0.30
EITHER      0.27
either      0.26
Either      0.25
Activations Density 0.298%
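
The density figure can be read as the share of tokens on which the feature is active. The sketch below is only an illustration under an assumption: that "density" means the fraction of tokens whose activation is strictly above zero. The page does not state its exact definition, and the function name `activation_density`, the `threshold` parameter, and the toy data are all hypothetical.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of tokens whose activation exceeds `threshold`.

    Assumption: a token counts as "active" when its activation is strictly
    greater than the threshold (0.0 by default).
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical toy data: 3 active tokens out of 1000.
toy_activations = [0.0] * 997 + [1.2, 0.4, 2.5]
density = activation_density(toy_activations)
print(f"{density:.3%}")
```

On this reading, a density of 0.298% would mean the feature is active on roughly 3 of every 1,000 tokens in the sampled text.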