    Explanations

    phrases related to critical commentary or negative viewpoints

    expressions of sentiment or opinion, especially negative emotions related to events or situations

    Negative Logits
    Token       Logit
    ?).         -0.85
    .).         -0.74
    etheless    -0.74
    .*          -0.69
    +.          -0.66
    .)          -0.66
     ).         -0.64
    arthed      -0.61
    odox        -0.61
    arist       -0.60
    Positive Logits
    Token       Logit
    ,"          1.13
    %"          1.07
     [          1.07
    .,"         0.92
    ",          0.92
    ":          0.91
    ,'"         0.89
    ,''         0.85
    "]          0.83
    "           0.82
    Activation Density: 0.740%

    No Known Activations