Explanations
Instances of the word "then" used as a temporal marker.
Negative Logits
Token      Logit
Harlow     -0.77
Newsom     -0.73
Irm        -0.73
ejus       -0.71
Marge      -0.70
лися       -0.70
Vip        -0.70
Folsom     -0.69
Winfrey    -0.69
Bär        -0.69
Positive Logits
Token      Logit
then       1.65
THEN       1.63
THEN       1.56
Then       1.47
then       1.43
Then       1.39
entonces   1.25
dann       1.17
Entonces   1.15
então      1.13
Activations Density: 0.082%