    Explanations

    mentions of excuses in various contexts

    Negative Logits
    Token       Logit
    "aris"      -0.07
    "cket"      -0.07
    " Dün"      -0.07
    "ongan"     -0.06
    "nds"       -0.06
    " dư"       -0.06
    "/MIT"      -0.06
    "seo"       -0.06
    " Aux"      -0.06
    "/or"       -0.06
    Positive Logits
    Token       Logit
    " excuse"   0.07
    "adoo"      0.07
    "ably"      0.07
    "stell"     0.07
    "ously"     0.06
    "ÙĪØµ"      0.06
    "oppel"     0.06
    "-num"      0.06
    "889"       0.06
    "æ¸Ī"       0.06
    Activation Density: 0.006%

    No Known Activations