INDEX
    Explanations

    phrases related to writing, opinions, and statements of fact

    expressions of complex emotional experiences and annotations related to writing

    Negative Logits
    )."    -0.82
    .""    -0.68
    .).    -0.61
    )"     -0.59
    '."    -0.58
    ]."    -0.57
    Lv     -0.56
    .''    -0.55
    .'"    -0.55
    .",    -0.53
    Positive Logits
     Canaver           0.70
    Spoiler            0.65
    etheless           0.59
     explanations      0.54
     Berks             0.52
     libertarians      0.51
     forgiven          0.51
     awfully           0.50
     whistleblowers    0.50
     Krugman           0.50
    Act Density 2.609%

    No Known Activations