INDEX
Explanations
references to subplots in narratives
Negative Logits
Token        Logit
s            -0.32
ups          -0.27
scription    -0.24
ify          -0.24
ably         -0.23
ulative      -0.22
house        -0.21
t            -0.21
uptools      -0.19
ogonal       -0.19
Positive Logits
Token        Logit
ordinated    0.24
ordinates    0.21
stract       0.20
woo          0.20
redits       0.20
mers         0.19
suming       0.19
terr         0.18
redit        0.17
tel          0.17
Activations Density: 0.010%