INDEX
Explanations
attends to instances of the term "minutes" from arbitrary later time-related expressions
Head Attr Weights
0:0.79
1:0.01
2:0.01
3:0.02
4:0.08
5:0.04
6:0.01
7:0.02
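
The head weights listed above can be summarized with a short script. This is a minimal sketch, assuming the values are per-head attribution weights for this feature (the listing does not define them further); the names `head_weights` and `dominant_head` are hypothetical and used only for illustration.

```python
# Head attribution weights copied from the listing above (head index -> weight).
# Assumption: each value is that head's attribution weight for this feature.
head_weights = {0: 0.79, 1: 0.01, 2: 0.01, 3: 0.02, 4: 0.08, 5: 0.04, 6: 0.01, 7: 0.02}


def dominant_head(weights):
    """Return (head index, share of the total listed weight) for the largest weight."""
    total = sum(weights.values())
    head = max(weights, key=weights.get)
    return head, weights[head] / total


head, share = dominant_head(head_weights)
print(f"head {head} carries {share:.0%} of the listed attribution weight")
```

Dividing by the sum of the listed weights, rather than assuming they add up to 1, keeps the share meaningful even if the displayed values are rounded.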
Negative Logits
to      -0.43
and     -0.41
or      -0.40
in      -0.39
as      -0.39
.       -0.38
for     -0.38
,       -0.38
even    -0.37
(       -0.37
Positive Logits
myſelf        0.81
itſelf        0.80
pleaſure      0.74
Monfieur      0.71
themſelves    0.71
Chriftian     0.71
Theſe         0.71
houſe         0.70
Diſ           0.68
purpoſe       0.68
Activations Density 0.015%