INDEX
    Explanations

    the word "but" in various contexts

    Negative Logits
    "ehr"         -0.17
    "ipop"        -0.16
    "dee"         -0.15
    " chứ"        -0.15
    " Guerrero"   -0.14
    "上が"        -0.14
    "osc"         -0.14
    "iteur"       -0.14
    "ても"        -0.14
    "assy"        -0.13
    Positive Logits
    " nice"        0.16
    "スカ"         0.15
    "nice"         0.14
    " apparently"  0.14
    " basically"   0.14
    " briefly"     0.14
    " maybe"       0.14
    " Suff"        0.14
    " nic"         0.13
    "711"          0.13
    Act Density 0.133%

    No Known Activations