INDEX
    Explanations

    phrases related to public statements or declarations

    references to comments and remarks, particularly in a political context

    Negative Logits
    Token           Logit
    "bid"           -0.77
    "fare"          -0.75
    "ISH"           -0.73
    "PG"            -0.69
    "ccording"      -0.66
    "tis"           -0.66
    "YP"            -0.64
    "tail"          -0.63
    "fruit"         -0.63
    "ishes"         -0.62
    Positive Logits
    Token           Logit
    " uttered"      1.10
    " aloud"        0.93
    " remarks"      0.92
    " regarding"    0.90
    " comments"     0.88
    " pertaining"   0.83
    " dispar"       0.82
    " attributed"   0.82
    " about"        0.79
    " praising"     0.78
    Activation Density: 0.056%

    No Known Activations