Explanations
references to characters and their relationships in narratives
Negative Logits
aeper    -0.16
142      -0.16
ungi     -0.14
izr      -0.14
Ñģи      -0.14
ÅĽnie    -0.14
ibri     -0.14
ëĬ       -0.14
ark      -0.13
illis    -0.13
Positive Logits
whom       0.39
who        0.24
whose      0.24
who        0.18
introdu    0.18
whose      0.17
اÙĦذÙĬÙĨ    0.16
kteÅĻÃŃ    0.15
ixe        0.14
们         0.14
Activations Density 0.711%