INDEX
    Explanations

    instances of the word "names."

    names or labels

    repeated mentions of the word "names."

    Negative Logits
    Token          Logit
    "yrinth"       -0.71
    "irth"         -0.70
    " Bed"         -0.69
    "OPLE"         -0.67
    " Yar"         -0.66
    " Smy"         -0.65
    "romy"         -0.64
    "UGE"          -0.62
    " Forestry"    -0.62
    "idth"         -0.61
    Positive Logits
    Token          Logit
    "paces"        1.58
    "pace"         1.10
    " names"       1.04
    "plates"       1.01
    " aliases"     1.01
    "paced"        0.96
    "ames"         0.96
    "hips"         0.92
    "akes"         0.91
    "names"        0.88
    Activation Density: 0.015%

    No Known Activations