INDEX
Explanations
approximations or estimates relating to quantities or measurements
Negative Logits
Token     Logit
'         -0.58
I         -0.56
II        -0.52
Action    -0.51
and       -0.51
sen       -0.50
ding      -0.49
D         -0.49
’         -0.49
Action    -0.49
Positive Logits
Token            Logit
approximately    1.74
approximately    1.68
approx           1.64
approxim         1.64
Approx           1.60
Approximately    1.59
approx           1.53
Approximately    1.52
approximate      1.51
Approx           1.49
Activations Density: 0.186%