INDEX
Explanations
phrases that indicate actions or transitions in narratives
Negative Logits
orts        -0.16
yelled      -0.14
urr         -0.14
excer       -0.14
_listen     -0.13
ã썿ĢĿ       -0.13
ÙĨÙħ        -0.13
verbatim    -0.13
portrayed   -0.13
obra        -0.13
Positive Logits
describe    0.41
explain     0.38
describing  0.37
describes   0.36
mention     0.35
talk        0.34
descri      0.34
explaining  0.33
describe    0.32
discuss     0.31
Activations Density 0.055%