INDEX
    Explanations

    Number abbreviations

    Negative Logits
    ABC        -0.11
    BP         -0.10
    AMA        -0.10
    KEA        -0.10
    IPO        -0.10
    OVA        -0.09
    ABE        -0.09
    10         -0.09
    CCA        -0.09
    ICP        -0.09
    Positive Logits
     tw         0.22
     ninete     0.20
     twenties   0.19
     fours      0.18
     Fs         0.18
     ones       0.18
     th         0.17
     Tw         0.17
     se         0.16
     Ds         0.16
    Activation Density: 0.062%

    No Known Activations