INDEX
Explanations
instances of specific numerical identifiers and references to amending documents or cases
Negative Logits
Houſe        -0.77
houſe        -0.71
<unused62>   -0.69
purpoſe      -0.67
<unused63>   -0.67
<unused61>   -0.66
pleaſure     -0.66
auroit       -0.65
<unused60>   -0.64
Diſ          -0.64
Positive Logits
<bos>    1.77
’        1.07
'        0.87
he       0.64
the      0.63
‘        0.59
those    0.58
in       0.57
‘        0.55
’,       0.55
Activations Density 0.000%