INDEX
Explanations
negation terms commonly used in legal contexts
Negative Logits
ARAB        -0.78
fwd         -0.77
ódz         -0.74
pezi        -0.73
Eisenberg   -0.71
SAX         -0.71
pleaf       -0.69
SAX         -0.69
Arab        -0.68
fwd         -0.68
Positive Logits
neither     1.18
neither     1.10
Nor         1.10
NOR         1.05
nor         1.04
nor         1.02
Nor         1.01
Norris      0.98
Neither     0.97
(!__        0.97
Activations Density 0.074%