INDEX
    Explanations

    phrases that describe a comparison or equivalence between different entities or concepts

    phrases that indicate equivalences or comparisons

    Negative Logits
     nonetheless    -0.80
    bender          -0.78
    erer            -0.78
    hess            -0.73
     nevertheless   -0.70
     trave          -0.67
     Became         -0.64
    hran            -0.64
    erers           -0.62
     furthermore    -0.62
    Positive Logits
    lihood          0.94
    Sov             0.74
    anus            0.74
     Schr           0.64
    othing          0.63
    onymous         0.61
     blackmail      0.61
     subsistence    0.61
    otin            0.61
     proverbial     0.58
    Activation Density 0.086%

    No Known Activations