INDEX
Explanations
dates written in a specific format: month and year separated by a comma
numerical sequences or identifiers
Negative Logits
exha          -0.70
Ging          -0.65
waivers       -0.63
NF            -0.62
Graves        -0.61
Fior          -0.60
DRAG          -0.60
assumptions   -0.60
consulting    -0.59
Kass          -0.59
Positive Logits
606      0.97
708      0.95
20439    0.93
211      0.92
chev     0.92
307      0.91
806      0.91
285      0.91
wm       0.90
309      0.89
Activations Density 0.102%