INDEX
    Explanations

    phrases indicating claims, decisions, or assertions regarding political actions or accountability

    Head Attr Weights
    0:0.03
    1:0.02
    2:0.11
    3:0.08
    4:0.13
    5:0.05
    6:0.05
    7:0.05
    8:0.08
    9:0.08
    10:0.13
    11:0.13
    Negative Logits
    #$              -1.43
     (&             -1.37
     Bastard        -1.33
    verages         -1.32
    sbm             -1.30
     Bundy          -1.27
     Belief         -1.24
    Awesome         -1.24
    Break           -1.23
    CLASS           -1.22
    Positive Logits
    ).[             1.30
     boutique       1.25
     hers           1.24
     later          1.23
    ordable         1.22
     respectively   1.19
     domestically   1.19
     creditor       1.16
    ".[             1.15
    ourt            1.15
    Act Density 0.017%

    No Known Activations