INDEX
Explanations
mathematical notation related to subscripts and superscripts
Negative Logits
Token        Logit
en           -0.74
rubin        -0.73
MOC          -0.73
}_{          -0.68
Mendez       -0.67
DPI          -0.67
Waugh        -0.66
McMillan     -0.65
########.    -0.65
Greenberg    -0.65
Positive Logits
Token        Logit
_{\          2.07
}_{\         1.70
)_{\         1.57
}_{\         1.57
_{\          1.47
}}_{\        1.33
]_{\         1.19
\|_{\        1.16
$_{\         1.05
|_{\         0.95
Activations Density 0.325%