INDEX
    Explanations

    references to specific groups or individuals related to societal issues

    Head Attr Weights
    0:0.02
    1:0.02
    2:0.08
    3:0.11
    4:0.41
    5:0.03
    6:0.05
    7:0.05
    8:0.03
    9:0.04
    10:0.06
    11:0.04
    Negative Logits
    " rout"           -1.84
    "ゴン"            -1.57
    " Cheong"         -1.52
    " insign"         -1.44
    " ted"            -1.42
    " consolation"    -1.41
    "obin"            -1.38
    "*."              -1.37
    " fixme"          -1.34
    " Abyssal"        -1.34
    Positive Logits
    "ographers"       1.87
    "esters"          1.87
    "ammers"          1.75
    "ographer"        1.72
    "writers"         1.68
    "ists"            1.67
    "rafted"          1.66
    "achers"          1.65
    "nai"             1.61
    "elected"         1.61
    Act Density 0.020%

    No Known Activations