INDEX
    Explanations

    pronouns "they" and "it"

    Negative Logits
    witter
    -0.07
     wholly
    -0.07
     ولي
    -0.06
    是一
    -0.06
     Tulsa
    -0.06
    uzzi
    -0.06
    Urls
    -0.06
    Cards
    -0.06
    ,'\
    -0.06
    ICLES
    -0.06
    Positive Logits
     akka
    0.07
    /em
    0.07
    …it
    0.07
     it
    0.07
    -it
    0.06
    ทาน
    0.06
    cheduled
    0.06
    -net
    0.06
    0.06
    _sup
    0.06
    Activation Density: 0.021%

    No Known Activations