INDEX
    Explanations

    numerical references, citations, and formal identifiers in a document

    Negative Logits
    " alone"     -0.14
    "bum"        -0.14
    "etal"       -0.14
    "ercial"     -0.14
    " scratch"   -0.14
    " fit"       -0.14
    " Nov"       -0.13
    " Hers"      -0.13
    " post"      -0.13
    " lets"      -0.13
    Positive Logits
    "参"         0.22
    " Cf"        0.22
    "See"        0.21
    " See"       0.21
    "see"        0.21
    " see"       0.20
    " 参"        0.19
    " cf"        0.19
    "cf"         0.18
    "onaut"      0.17
    Act Density 0.178%

    No Known Activations