INDEX
    Explanations

    content related to socio-political issues and historical events in China

    Head Attr Weights
    0:0.03
    1:0.01
    2:0.11
    3:0.47
    4:0.07
    5:0.06
    6:0.02
    7:0.06
    8:0.03
    9:0.05
    10:0.01
    11:0.02
    Negative Logits
     ­
    -6.66
     […]
    -6.62
    -5.80
    -5.58
    -5.31
    …]
    -4.89
    -4.68
    -4.68
     🙂
    -4.63
     …"
    -4.62
    Positive Logits
    --
    11.35
    !--
    8.70
     ``
    8.64
    )--
    8.14
    ``
    7.54
    .--
    7.29
    ---
    7.17
    ----
    7.02
    -->
    5.55
    ----------
    5.54
    Act Density 0.093%

    No Known Activations