INDEX
    Explanations

    auxiliary verbs, particularly forms of "did" and "does"

    Negative Logits
    Token          Logit
    "seau"         -0.16
    "athan"        -0.15
    "never"        -0.15
    "ád"           -0.15
    "unga"         -0.14
    " никогда"     -0.14
    "è©"           -0.14
    "oster"        -0.14
    "aybe"         -0.14
    " nunca"       -0.14
    Positive Logits
    Token          Logit
    " indeed"      0.35
    " everything"  0.27
    " not"         0.24
    " nothing"     0.23
    "inde"         0.23
    " Indeed"      0.22
    "actic"        0.21
    "Indeed"       0.20
    " what"        0.20
    " exactly"     0.20
    Activation Density: 0.085%

    No Known Activations