INDEX
    Explanations

    names of political figures and notable individuals

    Head Attr Weights
    0:0.01
    1:0.01
    2:0.22
    3:0.07
    4:0.07
    5:0.03
    6:0.03
    7:0.05
    8:0.07
    9:0.03
    10:0.17
    11:0.19
    Negative Logits
     affirmation  -1.30
    oret          -1.23
     salute       -1.18
     reminder     -1.17
    undown        -1.16
     delet        -1.16
     constitu     -1.15
    lot           -1.13
     didnt        -1.13
     reminds      -1.12
    Positive Logits
     whom      1.39
    [/         1.38
    SPONSORED  1.27
    Weather    1.22
    cific      1.19
     Sheldon   1.19
     invading  1.15
     Mutual    1.13
    phy        1.12
     helicop   1.11
    Act Density 0.289%

    No Known Activations