INDEX
Explanations
number sequences mixed with miscellaneous text
numerical values, particularly those related to dates and quantities
Negative Logits
appeal        -0.72
corrid        -0.69
SUPPORT       -0.69
cycle         -0.69
plateau       -0.67
restrictive   -0.67
relevance     -0.64
snowball      -0.64
transact      -0.64
optimistic    -0.63
Positive Logits
wm            0.91
Ever          0.90
soever        0.89
tm            0.88
pac           0.87
e             0.87
cott          0.87
20439         0.85
eu            0.85
hyde          0.84
Activations Density: 0.057%