INDEX
    Head Attribution Weights (head: weight)
    0:0.07
    1:0.12
    2:0.04
    3:0.04
    4:0.04
    5:0.27
    6:0.04
    7:0.03
    8:0.04
    9:0.05
    10:0.14
    11:0.05
    Negative Logits (token in quotes, then logit)
    "CN"                -2.06
    " downstream"       -1.90
    " located"          -1.90
    " locations"        -1.71
    "ItemImage"         -1.67
    "addons"            -1.62
    [token not shown]   -1.58
    " overhe"           -1.58
    " originate"        -1.57
    " Marketplace"      -1.56
    Positive Logits (token in quotes, then logit)
    " rul"              2.15
    "querque"           2.02
    "chwitz"            1.98
    "ardless"           1.96
    " punishment"       1.90
    "arnaev"            1.88
    " stigma"           1.86
    " ga"               1.76
    " sentencing"       1.72
    "ety"               1.72
    Activation Density: 0.001%

    No Known Activations