INDEX
    Explanations

    phrases indicating alignment or consistency with specific principles or standards

    phrases indicating alignment or agreement

    Negative Logits
    "chat"     -0.89
    "quer"     -0.77
    "rub"      -0.72
    "©¶æ¥µ"    -0.68
    "chen"     -0.68
    "alk"      -0.68
    "gging"    -0.67
    "ond"      -0.66
    "Chat"     -0.66
    "oiler"    -0.65
    Positive Logits
    " regard"          0.99
    " regards"         0.96
    " respect"         0.89
    " expectations"    0.78
    " tradition"       0.78
    "lihood"           0.74
    " impunity"        0.74
    "rium"             0.73
    " ideals"          0.72
    " precaution"      0.72
    Activation Density: 0.074%

    No Known Activations