    Explanations

    words related to a positive human interaction

    Negative Logits
    Token             Logit
    " friendly"       -2.28
    "friendly"        -2.02
    " Friendly"       -1.99
    "Friendly"        -1.95
    " friendliness"   -1.63
    " FRIEND"         -1.51
    " unfriendly"     -1.49
    "FRIEND"          -1.38
    " vriende"        -1.27
    "vriende"         -1.24
    Positive Logits
    Token             Logit
    "AddField"        0.49
    "ConfigureAwait"  0.47
    " معت"            0.47
    "missione"        0.43
    "["               0.41
    " IGN"            0.41
    " zaś"            0.40
    "expect"          0.40
    " expect"         0.40
    "اغ"              0.40
    Activation Density: 5.484%

    No Known Activations