INDEX
Explanations
the word "prior" in various contexts, indicating a focus on temporal references or previous events
Negative Logits
chin      -0.16
kre       -0.16
iber      -0.15
apor      -0.15
bru       -0.15
sterol    -0.14
pool      -0.14
au        -0.14
else      -0.14
hrom      -0.14
Positive Logits
/current   0.20
lush       0.15
lus        0.15
ê¹         0.15
-last      0.15
imes       0.15
wner       0.15
Äijây      0.14
/original  0.14
avad       0.14
Activations Density 0.013%