INDEX
    Explanations

    references to various societal issues and systems

    Head Attr Weights
    0:0.03
    1:0.05
    2:0.35
    3:0.08
    4:0.07
    5:0.07
    6:0.08
    7:0.03
    8:0.05
    9:0.03
    10:0.05
    11:0.05
    Negative Logits
    ")=("        -1.63
    "ertodd"     -1.52
    "rame"       -1.51
    "ayan"       -1.49
    " cand"      -1.47
    "ivan"       -1.45
    "rar"        -1.44
    " Scand"     -1.44
    "ograph"     -1.41
    "ensical"    -1.40
    Positive Logits
    "cause"        1.65
    "ngth"         1.64
    "selves"       1.61
                   1.54
    "ylum"         1.52
    " selves"      1.52
    " outwe"       1.51
    " repertoire"  1.50
    "glers"        1.50
    "asca"         1.49
    Act Density 0.136%

    No Known Activations