Explanations
have/had followed by temporal, duration, or state indicators
Negative Logits
Token    Value
been     1.05
been     0.83
BEEN     0.82
Been     0.72
ollut    0.71
sido     0.68
været    0.66
été      0.63
Been     0.62
olnud    0.54
Positive Logits
Token    Value
r        0.64
(        0.60
(        0.59
is       0.46
ergy     0.46
,        0.45
,        0.45
of       0.44
l        0.42
when     0.42
Activations Density: 0.057%