INDEX
Explanations
narrative structures and moments of realization or reflection
Negative Logits
ſeveral    -0.80
ſever      -0.75
myſelf     -0.74
ſelves     -0.73
Conſ       -0.73
leſs       -0.73
himſelf    -0.72
ſtate      -0.72
Cæsar      -0.67
itſelf     -0.67
Positive Logits
when       0.96
when       0.80
quando     0.74
later      0.72
gdy        0.69
när        0.69
cuando     0.67
When       0.66
khi        0.66
after      0.63
Activations Density: 0.229%