INDEX
    Explanations

    mathematical concepts and structures in formal proofs

    Negative Logits
    Token           Logit
    "egie"          -0.17
    "atables"       -0.16
    " shoulder"     -0.15
    " lap"          -0.15
    "oes"           -0.15
    " parity"       -0.15
    " bin"          -0.15
    "ksam"          -0.14
    " bud"          -0.14
    "tor"           -0.14
    Positive Logits
    Token           Logit
    " prox"         0.28
    "prox"          0.24
    " coco"         0.23
    " bile"         0.19
    " Clarke"       0.18
    " coder"        0.18
    " Rock"         0.18
    " proximity"    0.17
    " Arm"          0.17
    " Nem"          0.17
    Activation Density: 0.008%

    No Known Activations