INDEX
    Explanations

    mention of the word "la"

    instances of the substring "la"

    Negative Logits
    Token          Logit
    lessly         -0.81
    manship        -0.80
    lers           -0.74
    ELL            -0.71
    worthiness     -0.70
    liners         -0.70
    states         -0.67
    wolves         -0.66
    starter        -0.66
    sets           -0.65
    Positive Logits
    Token          Logit
    uthor           1.07
    uder            1.00
    pling           0.90
    uren            0.89
    veland          0.89
    ibrary          0.89
    very            0.85
    fts             0.83
    ques            0.82
    phia            0.82
    Activation Density: 0.011%

    No Known Activations