INDEX
Explanations
connective phrases indicating progression or continuity in narratives
Negative Logits
634         -0.16
Interpret   -0.16
si          -0.14
perceived   -0.14
interpret   -0.14
ilon        -0.14
ussed       -0.14
Perception  -0.13
ings        -0.13
AME         -0.13
Positive Logits
hearing     0.52
hear        0.46
Hearing     0.42
hear        0.42
hears       0.39
knowing     0.38
know        0.37
Hear        0.35
know        0.35
heard       0.34
Activations Density 0.018%