INDEX
    Explanations

    sentences involving admiration or positive sentiment towards qualities or actions of people

    expressions of opinion about people and their characteristics

    Negative Logits
    etheless          -0.67
    *.                -0.49
     Beir             -0.47
    ãĤ´ãĥ³            -0.45
    evidence          -0.45
     Frie             -0.43
    ãĤ¢ãĥ«            -0.43
    appropriately     -0.42
    amera             -0.42
    issance           -0.42
    Positive Logits
     however          0.52
     tho              0.50
     alot             0.43
     though           0.43
    !)                0.41
     mag              0.40
    natureconservancy 0.40
     google           0.39
     lay              0.39
    ?)                0.38
    Activation Density: 2.691%

    No Known Activations