    Explanations

    phrases indicating approval or acceptance

    expressions conveying approval or acceptance

    Negative Logits
    Token       Logit
    "marine"    -0.86
    "chet"      -0.79
    "hani"      -0.77
    "riber"     -0.68
    "cano"      -0.68
    "ulhu"      -0.68
    "arcity"    -0.66
    "raped"     -0.63
    "bane"      -0.62
    "quin"      -0.62
    Positive Logits
    Token        Logit
    "AY"         0.83
    "lahoma"     0.78
    " okay"      0.78
    " alright"   0.71
    "ably"       0.68
    " ol"        0.66
    " BUT"       0.66
    " enough"    0.65
    " ok"        0.64
    " OK"        0.64
    Activation Density: 0.023%

    No Known Activations