Explanations
numerical identifiers or codes, particularly in bibliographic contexts
Negative Logits
Token    Logit
184      -0.25
185      -0.20
183      -0.20
182      -0.17
Shore    -0.16
193      -0.16
191      -0.16
195      -0.16
464      -0.16
192      -0.16
Positive Logits
Token    Logit
late     0.20
158      0.20
128      0.19
118      0.18
159      0.18
689      0.18
Late     0.17
138      0.17
268      0.17
178      0.17
Activations Density: 0.050%