Explanations
references to locations or settings within narratives
Negative Logits
Token     Value
824       -0.17
achs      -0.17
άζ        -0.16
bekl      -0.15
ansom     -0.15
enet      -0.15
Wall      -0.14
WARN      -0.14
299       -0.14
isz       -0.14
Positive Logits
Token        Value
completion   0.18
completion   0.16
offer        0.15
hire         0.15
Completion   0.15
-song        0.15
éĸĢ          0.14
allot        0.14
wu           0.14
Completion   0.14
Activations Density 0.098%