INDEX
Explanations
temporal references to years or ages
Negative Logits
Ukra      -0.16
oldt      -0.15
Ctrls     -0.14
igger     -0.14
ÅĻiv      -0.14
year      -0.14
laus      -0.14
period    -0.14
iggers    -0.14
LES       -0.14
Positive Logits
siguiente     0.23
suiv          0.21
siguientes    0.20
preced        0.20
poster        0.19
achten        0.18
anterior      0.18
Poster        0.17
poster        0.17
success       0.17
Activations Density 0.014%