INDEX
    Explanations

    words indicating a strong negative opinion or action

    Negative Logits
    crow          -0.72
    essed         -0.65
     Pressure     -0.64
    eman          -0.64
    nings         -0.63
    ammy          -0.61
    hma           -0.60
    ILY           -0.59
    isson         -0.58
    imir          -0.58
    Positive Logits
    tering        0.80
     ç¥ŀ          0.73
     :(           0.71
     guiActiveUn  0.71
    ascript       0.71
    ainer         0.69
    heartedly     0.66
    ingu          0.65
     except       0.64
    onne          0.63
    Act Density 0.024%

    No Known Activations