INDEX
    Explanations

    instances of the word "name" and its variations

    Negative Logits
    "ness"      -0.18
    "tica"      -0.18
    "nds"       -0.18
    "_named"    -0.17
    "roy"       -0.16
    "rego"      -0.16
    "NESS"      -0.16
    "neau"      -0.15
    "ngo"       -0.15
    "nas"       -0.15
    Positive Logits
    "plate"     0.45
    "ake"       0.41
    "plates"    0.39
    "less"      0.33
    " sake"     0.31
    "cheap"     0.29
    "AKE"       0.29
    "akes"      0.29
    "lessly"    0.27
    "paced"     0.27
    Activation Density: 0.118%

    No Known Activations