INDEX
    Explanations

    words indicating presence or absence

    Negative Logits
    Token         Logit
    "ryn"         -0.16
    "enor"        -0.15
    "OI"          -0.15
    "cia"         -0.14
    " hasher"     -0.14
    "irsch"       -0.14
    " poultry"    -0.13
    "ollar"       -0.13
    "PFN"         -0.13
    "νε"          -0.13
    Positive Logits
    Token         Logit
    " ones"       0.30
    "ones"        0.20
    "them"        0.18
    " Ones"       0.18
    " hers"       0.18
    "cka"         0.17
    "176"         0.17
    " mine"       0.17
    "ck"          0.16
    " Mine"       0.16
    Act Density 0.004%

    No Known Activations