Explanations
mathematical concepts and structures in formal proofs
Negative Logits
egie       -0.17
atables    -0.16
shoulder   -0.15
lap        -0.15
oes        -0.15
parity     -0.15
bin        -0.15
ksam       -0.14
bud        -0.14
tor        -0.14
Positive Logits
prox        0.28
prox        0.24
coco        0.23
bile        0.19
Clarke      0.18
coder       0.18
Rock        0.18
proximity   0.17
Arm         0.17
Nem         0.17
Activations Density: 0.008%