INDEX
    Explanations

    terms and phrases related to political manipulation and critique

    Head Attr Weights
    Head  Weight
    0     0.06
    1     0.01
    2     0.06
    3     0.43
    4     0.04
    5     0.06
    6     0.02
    7     0.05
    8     0.03
    9     0.02
    10    0.14
    11    0.02
    Negative Logits
     pairs      -2.73
     pads       -2.58
     beds       -2.56
     rooms      -2.53
    english     -2.48
     ranges     -2.42
     servers    -2.41
     monitors   -2.39
     streams    -2.38
     locations  -2.38
    Positive Logits
     betrayal        3.23
     accomplishment  3.07
     fallacy         3.02
     setback         3.00
     folly           2.99
     inev            2.90
     heresy          2.88
     inconvenience   2.87
     disgrace        2.87
     imposition      2.85
    Act Density: 0.725%

    No Known Activations