    Explanations

    negative sentiment or expressions of distress

    Negative Logits
    Logit   Token (quoted; leading spaces preserved)
    -0.71   "stdc"
    -0.66   " виправивши"
    -0.64   "himself"
    -0.62   " sendiri"
    -0.59   " himself"
    -0.58   " IMDG"
    -0.57   "เอง"
    -0.57   "imageio"
    -0.56   "aculture"
    -0.56   "awtextra"

    Positive Logits
    Logit   Token (quoted; leading spaces preserved; \" and \n escaped)
    0.67    "\"));\n"
    0.67    " \"));"
    0.66    "tanie"
    0.64    " claim"
    0.64    " CHtml"
    0.63    "ритори"
    0.63    ".\"],"
    0.62    "iseta"
    0.62    " Lave"
    0.62    ")\";\n"

    Activation Density: 0.009%

    No Known Activations