INDEX
Explanations
emotional expressions and relationships in personal narratives
Negative Logits
once       -0.26
once       -0.23
Once       -0.21
Once       -0.20
wherever   -0.16
etto       -0.16
amma       -0.15
_once      -0.15
aga        -0.14
antan      -0.14
Positive Logits
when       0.24
followed   0.23
when       0.21
.when      0.20
quando     0.19
gdy        0.18
cuando     0.17
until      0.17
عندما      0.17
When       0.17
Activations Density 0.014%