INDEX
    Explanations

    the presence of specific names or proper nouns

    Head Attr Weights
    0:0.08
    1:0.08
    2:0.06
    3:0.08
    4:0.09
    5:0.07
    6:0.09
    7:0.08
    8:0.07
    9:0.09
    10:0.06
    11:0.08
    Negative Logits
     lett     -2.38
     Afric    -2.29
     Letters  -2.19
    OPLE      -2.06
    "],"      -2.05
     Roses    -2.04
     ILCS     -2.01
     ann      -2.00
    Sources   -1.97
     Cabinet  -1.97
    Positive Logits
    tracking   2.54
    ptions     2.24
    perfect    2.17
    erest      2.07
    progress   2.07
    clair      2.06
     tame      2.00
    vant       2.00
    zech       2.00
    keley      2.00
    Act Density 0.000%

    No Known Activations