INDEX
Explanations
programming-related terms, particularly related to code execution and configuration
Negative Logits
AIDS       -0.82
Oprah      -0.73
assault    -0.71
Alabama    -0.70
America    -0.69
ampunk     -0.69
Americ     -0.67
African    -0.66
America    -0.66
arians     -0.66
Positive Logits
bunch      1.06
static     1.02
subset     1.01
comma      1.00
separate   0.99
single     0.98
few        0.98
recursive  0.97
dummy      0.97
simple     0.96
Activations Density 0.190%