INDEX
    Explanations

    instances of intimidation and manipulation in various contexts

    Head Attr Weights
    0:0.03
    1:0.02
    2:0.05
    3:0.06
    4:0.13
    5:0.03
    6:0.04
    7:0.36
    8:0.03
    9:0.03
    10:0.09
    11:0.09
    Negative Logits
    variable:-1.45
    rh:-1.35
     Somewhere:-1.31
    fixed:-1.31
     salvage:-1.28
    album:-1.27
    amy:-1.27
     Imaging:-1.27
    ゴン:-1.23
     miracle:-1.21
    Positive Logits
     opponents:1.72
     foes:1.70
     passers:1.65
     subordinates:1.64
     intimidated:1.58
     challengers:1.57
     adversaries:1.54
     superiors:1.53
     bullies:1.48
     merciless:1.43
    Act Density 0.003%

    No Known Activations