INDEX
    Explanations

    the repeated occurrence of the substring "ne" within words

    Negative Logits
    rador      -0.96
    hips       -0.83
    rament     -0.80
    inarily    -0.79
    allery     -0.77
     glim      -0.76
    orsi       -0.76
    rican      -0.74
    enhagen    -0.74
    ENCY       -0.73

    Positive Logits
    arest      0.97
    cht        0.96
    jad        0.94
    phrine     0.90
    verend     0.90
    zel        0.89
    gan        0.89
    cks        0.88
    gger       0.86
    zi         0.86
    Act Density 0.022%

    No Known Activations