INDEX
    Explanations
    No Explanations Found
    Head Attr Weights
    0:0.08
    1:0.07
    2:0.08
    3:0.08
    4:0.08
    5:0.09
    6:0.08
    7:0.08
    8:0.08
    9:0.07
    10:0.08
    11:0.08
    Negative Logits
    "arro"       -3.12
    "otle"       -2.99
    "uci"        -2.83
    "utm"        -2.71
    " cider"     -2.66
    "abre"       -2.63
    "ocl"        -2.60
    "alde"       -2.58
    " bleach"    -2.55
    " DeL"       -2.55
    Positive Logits
    "iability"     2.48
    " ­"           2.48
    "Politics"     2.37
    (blank)        2.35
    "]'"           2.35
    " Socialism"   2.33
    (blank)        2.27
    ".]"           2.19
    " Buildings"   2.18
    " Mahmoud"     2.18
    Act Density 0.000%

    No Known Activations

    This feature has no known activations.