Explanations
mentions of specific characters or figures in a narrative
Negative Logits
ĶåĽŀ     -0.15
Kore     -0.14
Orr      -0.14
Mare     -0.14
oracle   -0.14
fü       -0.14
dig      -0.13
uong     -0.13
ová      -0.13
Gros     -0.13
Positive Logits
eyh        0.15
аÑĢод      0.15
нед        0.14
gee        0.14
lap        0.14
criminal   0.14
Noir       0.13
andon      0.13
çĤİ        0.13
Touches    0.13
Activations Density: 0.015%