INDEX
    Explanations

    the word "however" and its variations, indicating a contrast or exception in the text

    Negative Logits
    Token       Logit
    "sel"       -0.18
    "sb"        -0.16
    "uras"      -0.16
    "ilia"      -0.16
    "s"         -0.15
    "uro"       -0.15
    "amber"     -0.15
    "sik"       -0.15
    "inel"      -0.15
    "ers"       -0.15
    Positive Logits
    Token       Logit
    "forth"     0.15
    "ÑģÑıÑĤ"    0.15
    "ucch"      0.15
    "ìĤ¬íķŃ"    0.14
    " latter"   0.14
    "fois"      0.14
    "itage"     0.14
    "isto"      0.13
    "że"        0.13
    "tility"    0.13
    Act Density 0.047%

    No Known Activations