    Explanations

    proper nouns, particularly names associated with notable individuals or entities

    Head Attr Weights
    0:0.05
    1:0.07
    2:0.03
    3:0.03
    4:0.04
    5:0.40
    6:0.03
    7:0.02
    8:0.05
    9:0.09
    10:0.09
    11:0.04
    Negative Logits
    eur             -1.67
    mong            -1.66
     Alger          -1.63
     antiquity      -1.61
    slaught         -1.58
    iod             -1.57
     criminality    -1.52
                    -1.51
    nesty           -1.50
     sinners        -1.47
    Positive Logits
    */(             2.08
     sidx           1.92
    ['              1.91
    ":[{"           1.88
    rolet           1.87
    Reviewer        1.75
     Motors         1.73
    wcsstore        1.71
     UCHIJ          1.70
     Twins          1.70
    Act Density 0.040%

    No Known Activations