INDEX
    Explanations

    phrases indicating contrast or contradiction

    phrases indicating contrasts or antonyms

    Negative Logits
    lished          -0.83
    zeb             -0.82
    ufact           -0.80
     Mush           -0.76
     Annotations    -0.76
    uala            -0.75
    ondon           -0.72
    atche           -0.71
    beans           -0.69
    anches          -0.68
    Positive Logits
     approach       0.76
    icter           0.74
     minded         0.74
    etheless        0.74
     scenario       0.72
    isphere         0.71
     gender         0.70
     side           0.70
    osite           0.69
    =#              0.68
    Act Density 0.020%

    No Known Activations