INDEX
    Explanations

    conjunctions and other connecting words that indicate contrast or continuation in a discussion

    Negative Logits
    atego           -0.15
    DataContract    -0.15
    illard          -0.15
    اÙģØª           -0.15
    .mit            -0.15
    øy              -0.14
    awe             -0.14
    idunt           -0.14
    esco            -0.14
    rière           -0.14

    Positive Logits
    Thumb           0.16
    edit            0.15
     thumbs         0.15
    atab            0.14
    Sibling         0.14
    mae             0.14
     пÑĢож          0.14
     torch          0.13
    cur             0.13
     now            0.13
    Act Density 0.002%

    No Known Activations