    Explanations

    instances of the word "but" in various forms and contexts

    Negative Logits
    Token           Logit
    "бо"            -0.16
    "ziel"          -0.16
    "olly"          -0.16
    " therefore"    -0.15
    "duct"          -0.15
    "§"             -0.15
    " alike"        -0.15
    "uppe"          -0.15
    "beth"          -0.15
    "ops"           -0.15
    Positive Logits
    Token           Logit
    "chers"         0.26
    "ters"          0.25
    "term"          0.25
    "lers"          0.25
    "åĩ¡"           0.23
    "tpl"           0.22
    "rint"          0.22
    "ler"           0.22
    "ts"            0.22
    "ressing"       0.22
    Activation Density: 0.049%

    No Known Activations