INDEX
Explanations
references to reports, documents, and legal terminology
Negative Logits
Token        Logit
ehler        -0.16
stehen       -0.15
.chapter     -0.14
Corner       -0.14
compose      -0.14
actionDate   -0.14
compose      -0.14
corner       -0.14
ylene        -0.13
ermo         -0.13
Positive Logits
Token        Logit
contains     0.29
mention      0.25
contains     0.24
contain      0.24
containing   0.23
mentions     0.23
states       0.22
reference    0.21
says         0.21
states       0.20
Activations Density: 0.199%