    Explanations

    phrases related to comparison or contrast

    Negative Logits
    " agre"          -0.68
    "yll"            -0.65
    "ular"           -0.60
    "Interstitial"   -0.59
    "ians"           -0.59
    "unicip"         -0.58
    "heit"           -0.57
    "usage"          -0.57
    " mas"           -0.56
    "formance"       -0.55
    Positive Logits
    " thirds"        0.68
    "ses"            0.63
    "oused"          0.63
    " finalists"     0.61
    " Cups"          0.61
    " theirs"        0.60
    " Colo"          0.59
    "eely"           0.59
    "legged"         0.58
    "milo"           0.58
    Activation Density: 0.056%

    No Known Activations