INDEX
    Explanations

    phrases contrasting two different perspectives or pieces of information

    contrasting ideas or perspectives

    Negative Logits
    semb            -0.66
    asley           -0.64
    76561           -0.63
    STAR            -0.63
     Saras          -0.62
     Annotations    -0.61
    stein           -0.61
     suffice        -0.59
    before          -0.59
    icago           -0.58
    Positive Logits
     srfAttach      0.77
     opposite       0.73
    )].             0.65
    cul             0.65
    Õ               0.64
     second         0.63
     latter         0.62
    ouple           0.61
    middle          0.60
    cum             0.60
    Act Density 0.060%

    No Known Activations