INDEX
    Explanations

    positive opinions about things

    statements about personal favorites or notable experiences in films or music

    Negative Logits
    Token            Logit
    "iot"            -0.75
    "iership"        -0.68
    "FAQ"            -0.67
    "angering"       -0.66
    "threat"         -0.63
    "wake"           -0.63
    "ammers"         -0.62
    "usercontent"    -0.62
    "encies"         -0.61
    "ught"           -0.61

    Positive Logits
    Token            Logit
    " definitely"    1.22
    " certainly"     1.06
    " undoubtedly"   1.04
    " probably"      1.02
    " arguably"      0.96
    " obviously"     0.88
    " basically"     0.87
    " perhaps"       0.86
    " undeniably"    0.85
    " another"       0.83

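
The source does not say how the logit lists above were produced. One common approach, sketched here purely as an assumption, is to project a feature's decoder direction through the model's unembedding matrix and sort the resulting per-token effects. The names `top_logit_effects`, `decoder_dir`, `unembed`, and `vocab` are hypothetical, and the numbers in the toy example are made up for illustration; they are not the values from this dashboard.

```python
import numpy as np

def top_logit_effects(decoder_dir, unembed, vocab, k=10):
    """Return (top positive, top negative) lists of (token, effect) pairs.

    Assumed shapes (hypothetical, for illustration only):
      decoder_dir: (d_model,)        feature's decoder direction
      unembed:     (d_model, n_vocab) unembedding matrix
      vocab:       list of n_vocab token strings
    """
    effects = decoder_dir @ unembed          # one effect per vocabulary token
    order = np.argsort(effects)              # ascending: most negative first
    negative = [(vocab[i], float(effects[i])) for i in order[:k]]
    positive = [(vocab[i], float(effects[i])) for i in order[::-1][:k]]
    return positive, negative

# Toy example with invented numbers (not taken from the dashboard).
vocab = [" definitely", " perhaps", "FAQ", "threat"]
unembed = np.array([[1.0, 0.5, -0.5, -1.0],
                    [0.0, 1.0,  1.0,  0.0]])
decoder_dir = np.array([1.0, 0.0])

positive, negative = top_logit_effects(decoder_dir, unembed, vocab, k=2)
print(positive)
print(negative)
```

Tokens are kept as raw strings, so a leading space (as in `" definitely"`) stays visible; that distinction matters because word-initial and word-internal tokens are different vocabulary entries.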
    Activation Density: 0.309%

    No Known Activations