INDEX
Explanations
references to time durations
Negative Logits
unt      -0.17
erez     -0.16
upon     -0.15
for      -0.15
upon     -0.15
UNT      -0.15
later    -0.14
apon     -0.14
ucci     -0.14
resh     -0.14
Positive Logits
leading       0.31
Leading       0.25
leading       0.25
Leading       0.24
directly      0.23
immediately   0.21
following     0.20
immedi        0.19
takip         0.18
following     0.17
Activations Density 0.033%
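
The page does not define the density figure. A minimal sketch, assuming it is the fraction of sampled token positions on which the feature's activation is above zero; the function name, the threshold parameter, and the sample data below are hypothetical and chosen only for illustration:

```python
from typing import Sequence


def activation_density(activations: Sequence[float], threshold: float = 0.0) -> float:
    """Return the fraction of positions whose activation exceeds `threshold`.

    Assumption: "density" means the share of sampled tokens on which the
    feature fires at all. The source page does not state its definition.
    """
    if not activations:
        raise ValueError("activations must not be empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: the feature fires on 33 of 100,000 token positions.
sample = [0.0] * 99_967 + [1.0] * 33
density = activation_density(sample)
print(f"{density:.3%}")
```

Under this reading, a density of 0.033% corresponds to roughly one active token in every 3,000 sampled.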