INDEX
    Explanations

    pronouns and definite articles in the text

    Negative Logits
    Token           Logit
    "aget"          -0.15
    "upertino"      -0.15
    "ho"            -0.15
    "silver"        -0.14
    "»"             -0.13
    " brightest"    -0.13
    "innamon"       -0.13
    " slee"         -0.13
    " Buen"         -0.13
    " Wikipedia"    -0.13
    Positive Logits
    Token           Logit
    "odable"        0.16
    "andel"         0.16
    "£"             0.15
    "erver"         0.15
    "üle"           0.15
    "elan"          0.14
    "ortex"         0.14
    "abela"         0.14
    "alon"          0.14
    "rych"          0.14
    Activation Density: 0.011%

    No Known Activations