    Explanations

    occurrences of the word "le" and its variations

    Negative Logits
    Token     Logit
    ri        -0.18
    resse     -0.17
    ga        -0.16
    dre       -0.16
    re        -0.15
    illes     -0.15
    rought    -0.15
    gli       -0.14
    illet     -0.14
    éra       -0.14
    Positive Logits
    Token     Logit
    opard     0.22
    aky       0.20
    aving     0.20
    aping     0.19
    eks       0.17
    wd        0.17
    eward     0.17
    ettle     0.17
    islation  0.17
    prech     0.17
    Activation Density: 0.034%

    No Known Activations