INDEX
    Explanations

    statements about the attributes or actions of different groups of people

    phrases that indicate opinions or beliefs expressed by people

    Negative Logits
    Token            Logit
    "comes"          -0.71
    "Posts"          -0.66
    "Appearances"    -0.62
    "aign"           -0.61
    " Cumber"        -0.59
    "_."             -0.59
    "ãĥ¯"            -0.57
    " è£ıè¦ļéĨĴ"     -0.57
    " Wizard"        -0.57
    " Lets"          -0.56
    Positive Logits
    Token            Logit
    " disapprove"    1.26
    " prefer"        1.13
    " regretted"     1.08
    " dislike"       1.06
    "'ve"            1.06
    "'re"            1.03
    "'d"             1.01
    " approve"       0.98
    " intend"        0.98
    " condone"       0.97
    Activation Density: 0.105%

    No Known Activations