INDEX
    Explanations

    negations and expressions of personal opinions or feelings

    Negative Logits
    "erval"       -0.17
    "oun"         -0.15
    "ouz"         -0.15
    "eniable"     -0.14
    "adden"       -0.14
    "osaur"       -0.14
    "ously"       -0.14
    " Král"       -0.13
    "uin"         -0.13
    "anel"        -0.13
    Positive Logits
    " who"        0.31
    "who"         0.28
    " beg"        0.23
    " Who"        0.22
    "Who"         0.21
    " shr"        0.20
    " Meh"        0.20
    " shrugged"   0.19
    " qui"        0.19
    " thems"      0.18
    Act Density 0.221%

    No Known Activations