Explanations
narrative elements involving characters and their interactions in a story
Negative Logits
weren      -0.23
estaba     -0.21
were       -0.21
yoktu      -0.21
étaient    -0.21
estava     -0.19
были       -0.19
vardı      -0.19
ήταν       -0.19
était      -0.18
Positive Logits
does       0.26
한다       0.24
한다       0.23
ивается    0.22
된다       0.21
는다       0.21
becomes    0.20
does       0.20
DOES       0.20
goes       0.20
Activations Density 0.121%