INDEX
    Explanations

    instances of the letter "d" in various cases (upper and lower)

    Negative Logits
    Token    Logit
    andro    -0.15
    oleon    -0.15
    ITCH     -0.15
    INGER    -0.15
    iero     -0.15
    INVAL    -0.15
    alon     -0.14
    dro      -0.14
    zek      -0.14
    abwe     -0.14
    Positive Logits
    Token           Logit
    istinguished    0.33
    istingu         0.29
    ipl             0.28
    rama            0.25
    iversity        0.25
    etermination    0.25
    ign             0.23
    istinguish      0.23
    eline           0.23
    etailed         0.23
    Activation Density: 0.038%

    No Known Activations