INDEX
    Explanations

    occurrences of the word "do" in various forms and contexts

    Negative Logits
    Token     Logit
    nya       -0.18
    ka        -0.18
    lify      -0.17
    ervas     -0.16
    noon      -0.16
    uelle     -0.15
    ulous     -0.15
    wo        -0.15
    elay      -0.15
    apolis    -0.15
    Positive Logits
    Token     Logit
    ÅĤÄħ      0.20
    zens      0.19
    berman    0.17
    ehler     0.17
    ctype     0.17
    osing     0.16
    ob        0.16
    ,         0.15
    ress      0.15
    leta      0.15
    Act Density 0.065%
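
The density figure above can be read as the share of token positions on which the feature fires. The sketch below is only an illustration under that assumption: the function name `activation_density`, the zero threshold, and the sample data are hypothetical and are not taken from the dashboard.

```python
def activation_density(activations, threshold=0.0):
    """Return the percentage of positions whose activation exceeds `threshold`.

    Assumption: "density" means the fraction of token positions with a
    non-zero activation; the dashboard's exact definition is not stated.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return 100.0 * active / len(activations)


# Hypothetical sample: one active position out of 10,000.
sample = [0.0] * 9999 + [0.5]
print(f"Act Density {activation_density(sample):.3f}%")
```

On this reading, a density of 0.065% would correspond to roughly 65 active positions per 100,000 tokens.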

    No Known Activations