Explanations
parentheses and numerical identifiers within technical or scientific contexts
Negative Logits
Token   Logit
dg      -0.18
cbc     -0.18
DK      -0.18
dff     -0.18
dcc     -0.18
DG      -0.17
Dmit    -0.17
dfa     -0.16
DFS     -0.16
DFS     -0.16
Positive Logits
Token   Logit
McD     0.32
WD      0.31
AD      0.28
SD      0.28
AD      0.28
LD      0.28
SD      0.27
GD      0.27
JD      0.26
GD      0.26
Activations Density: 0.053%