INDEX
    Explanations

    names, specifically those with the sequence "na" with varying levels of specificity

    occurrences of the substring "na" within words

    Negative Logits
        Token     Logit
        ienced    -0.81
        neys      -0.79
        ======    -0.71
        tails     -0.71
        ansas     -0.70
        raved     -0.69
        loo       -0.66
        layer     -0.66
        birds     -0.64
        wolves    -0.62
    Positive Logits
        Token     Logit
        eus       1.24
        uthor     1.19
        vel       0.92
        ples      0.91
        isance    0.90
        ACP       0.87
        ïve       0.86
        emi       0.85
        veland    0.81
        ñ         0.80
    Activation Density: 0.033%

    No Known Activations