    Explanations

    concepts related to social issues and structures

    Head Attr Weights
    0:0.18
    1:0.03
    2:0.01
    3:0.11
    4:0.28
    5:0.05
    6:0.03
    7:0.02
    8:0.11
    9:0.07
    10:0.01
    11:0.03
    Negative Logits
    )",
    -1.99
    ラン
    -1.97
    ilaterally
    -1.76
    FINE
    -1.72
    ファ
    -1.66
     GER
    -1.65
    Victoria
    -1.64
    eteria
    -1.59
     Carrie
    -1.58
    actionDate
    -1.54
    Positive Logits
     represents
    3.24
     extends
    3.17
     underscores
    2.88
     illustrates
    2.75
     reflects
    2.60
     lends
    2.60
     resembles
    2.56
     embodies
    2.55
     does
    2.53
     constitutes
    2.53
    Act Density 0.054%

    No Known Activations