INDEX
Explanations
references to location and contextual elements within a narrative
Negative Logits
ads       -0.15
witch     -0.14
κÏħ       -0.14
Staten    -0.14
ugu       -0.14
brav      -0.14
ans       -0.14
iod       -0.14
iece      -0.14
ipse      -0.14
Positive Logits
Nap       0.28
Wa        0.26
Levin     0.24
Poverty   0.24
Haw       0.24
Hamilton  0.24
Ta        0.23
Bulls     0.23
Mast      0.23
Morr      0.23
Activations Density: 0.010%