Explanations
instances of hesitation or uncertainty in a narrative
Negative Logits
Token                      Logit
hap                        -0.16
luck                       -0.15
YS                         -0.14
fortunately                -0.14
ors                        -0.14
alon                       -0.14
ãĥ¼ãĥĦ (ーツ)              -0.14
ãĥ©ãĥĥãĤ¯ (ラック)        -0.14
versed                     -0.13
xm                         -0.13
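The entries `ãĥ¼ãĥĦ` and `ãĥ©ãĥĥãĤ¯` in the negative-logit list look like byte-level BPE token strings rather than corrupted text. The sketch below assumes the tokenizer uses the GPT-2 style byte-to-unicode mapping (the dashboard does not name the model, so this is an assumption) and reverses that mapping to recover readable text. The helper name `decode_token` is hypothetical.

```python
def bytes_to_unicode():
    """Rebuild the GPT-2 style table that maps each byte (0-255) to a printable character."""
    # Bytes that are already printable map to themselves.
    bs = (
        list(range(ord("!"), ord("~") + 1))
        + list(range(ord("¡"), ord("¬") + 1))
        + list(range(ord("®"), ord("ÿ") + 1))
    )
    cs = bs[:]
    n = 0
    # Remaining bytes are shifted to code points starting at 256.
    for b in range(256):
        if b not in bs:
            bs.append(b)
            cs.append(256 + n)
            n += 1
    return dict(zip(bs, (chr(c) for c in cs)))


# Inverse table: printable character -> original byte value.
UNICODE_TO_BYTE = {ch: b for b, ch in bytes_to_unicode().items()}


def decode_token(token: str) -> str:
    """Turn a byte-level token string back into UTF-8 text (assumes GPT-2 style mapping)."""
    raw = bytes(UNICODE_TO_BYTE[ch] for ch in token)
    # errors="replace" keeps partial multi-byte tokens from raising.
    return raw.decode("utf-8", errors="replace")


if __name__ == "__main__":
    for tok in ["ãĥ¼ãĥĦ", "ãĥ©ãĥĥãĤ¯", "luck"]:
        print(repr(tok), "->", repr(decode_token(tok)))
```

Under that assumption the two tokens decode to the katakana strings ーツ and ラック. The latter is the Japanese loanword for "luck", which is consistent with the other suppressed tokens (`luck`, `fortunately`).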
Positive Logits
Token                      Logit
man                        0.38
boy                        0.36
holy                       0.35
lord                       0.27
trust                      0.27
HOL                        0.26
Holy                       0.26
boy                        0.25
Boy                        0.25
-boy                       0.24
Activations Density 0.293%
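As a rough illustration of the density figure, the sketch below assumes that "activation density" means the fraction of sampled tokens on which the feature fires at all (activation above zero). The dashboard does not define the metric, so that reading, the function name `activation_density`, and the toy data are all assumptions.

```python
def activation_density(activations, threshold=0.0):
    """Fraction of token positions whose activation exceeds `threshold`.

    Assumed definition: the source page reports the number but not the formula.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


if __name__ == "__main__":
    # Toy data: 1000 token positions, the feature fires on 3 of them.
    toy = [0.0] * 997 + [1.2, 0.4, 2.0]
    print(f"{activation_density(toy):.3%}")
```

On this reading, a density of 0.293% would mean the feature is active on roughly 3 of every 1000 tokens in the sample, which is typical of a sparse feature.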