INDEX
    Explanations
    Head Attr Weights
    Head    Weight
    0       0.08
    1       0.08
    2       0.07
    3       0.08
    4       0.09
    5       0.09
    6       0.08
    7       0.07
    8       0.08
    9       0.07
    10      0.08
    11      0.06
    Negative Logits
    Token               Logit
    "Ide"               -2.40
    "Prof"              -2.31
    "alde"              -2.27
    " Tycoon"           -2.14
    "rology"            -2.14
    " '."               -2.12
    "YP"                -2.07
    "Politics"          -2.04
    " Particularly"     -2.04
    " Podesta"          -2.01
    Positive Logits
    Token               Logit
    " cushion"          2.58
    " rain"             2.45
    " punch"            2.21
    " taxi"             2.21
    " punching"         2.17
    " patrolling"       2.17
    " taxis"            2.15
    "phrine"            2.15
    " lions"            2.14
    "Pool"              2.11
    Act Density 0.000%

    No Known Activations