INDEX
    Explanations

    words or phrases related to criticism and political statements

    sentences that include declarative statements, particularly those indicating serious issues or opinions

    Negative Logits
        " volunte"     -0.85
        " tremend"     -0.84
        " gobl"        -0.80
        " confir"      -0.73
        " corrid"      -0.71
        " millenn"     -0.71
        " purse"       -0.70
        " unnecess"    -0.69
        " defic"       -0.69
        " challeng"    -0.69
    Positive Logits
        " His"         2.12
        " He"          2.11
        "His"          1.79
        "He"           1.66
        " Himself"     1.47
        "his"          1.38
        " his"         1.26
        " he"          1.25
        " HIS"         1.24
        " Asked"       1.15
    Act Density 0.607%

    No Known Activations