INDEX
    Explanations

    statements related to current events, politics, and policy

    positive sentiments and expressions of approval related to governance or policies

    Negative Logits
    ゴン
    -1.02
    etheless
    -0.88
    quished
    -0.73
    .).
    -0.72
    。
    -0.71
    .(
    -0.70
    %.
    -0.68
    ?).
    -0.67
    ().
    -0.66
     Annotations
    -0.66
    Positive Logits
    ,'"
    1.48
    ,"
    1.45
     [
    1.28
    ),"
    1.27
    ,''
    1.20
    .,"
    1.06
    ',"
    1.05
    ,'
    1.05
     ['
    1.05
    %"
    1.05
    Activation Density 1.399%

    No Known Activations