INDEX
    Explanations

    references to whistleblowers and related terminology

    Head Attr Weights
    0:0.03
    1:0.01
    2:0.05
    3:0.06
    4:0.05
    5:0.03
    6:0.39
    7:0.08
    8:0.04
    9:0.05
    10:0.11
    11:0.05
    Negative Logits
    "States"         -1.51
    " Calculator"    -1.44
    " Monetary"      -1.42
    "ICAN"           -1.40
    (blank)          -1.38
    "WAY"            -1.37
    " affinity"      -1.37
    "ONSORED"        -1.36
    "Chan"           -1.35
    "uala"           -1.35
    Positive Logits
    "glers"          1.75
    "sed"            1.71
    "etheus"         1.60
    "nsics"          1.43
    "iren"           1.43
    "irens"          1.40
    "gging"          1.38
    " crack"         1.38
    "killed"         1.38
    "puff"           1.37
    Act Density 0.001%

    No Known Activations