INDEX
    Explanations

    phrases expressing contrast or contradiction

    phrases that introduce contrasting ideas or elaborations

    Negative Logits
    Token               Logit
    "olars"             -0.87
    "irm"               -0.70
    "uay"               -0.68
    "orc"               -0.67
    "ory"               -0.65
    "croft"             -0.64
    "utch"              -0.63
    "tty"               -0.63
    "itch"              -0.63
    "miss"              -0.63

    Positive Logits
    Token               Logit
    " also"             0.72
    " Thumbnails"       0.63
    "chery"             0.61
    "Recomm"            0.61
    " ALSO"             0.61
    "cially"            0.60
    "DES"               0.60
    " simultaneously"   0.60
    "also"              0.59
    " crabs"            0.58

    Act Density 0.134%

    No Known Activations