INDEX
    Explanations

    topics related to raising awareness about various social issues

    Head Attr Weights
    0:0.01
    1:0.05
    2:0.07
    3:0.06
    4:0.01
    5:0.05
    6:0.04
    7:0.13
    8:0.09
    9:0.20
    10:0.08
    11:0.17
    Negative Logits
    " onward"     -1.26
    " onwards"    -1.22
    "aults"       -1.20
    "osure"       -1.15
    "xes"         -1.14
    "gui"         -1.14
    "ornia"       -1.13
    "robe"        -1.12
    "inion"       -1.11
    "eper"        -1.11
    Positive Logits
    " tattoo"         1.24
    " Pengu"          1.17
    " broadcasters"   1.12
    " redes"          1.10
    "PLA"             1.08
    "STAR"            1.07
    " newcom"         1.06
    " tattoos"        1.05
    "agnetic"         1.04
    "paren"           1.03
    Act Density 0.005%

    No Known Activations