INDEX
    Explanations

    words related to political figures and events

    Negative Logits
    Token           Logit
    "IUM"           -0.80
    "lished"        -0.58
    " pressures"    -0.57
    "¶ħ"            -0.57
    " popcorn"      -0.56
    " hazards"      -0.55
    " surface"      -0.55
    " calibrated"   -0.55
    " Downs"        -0.54
    "Ĥİ"            -0.54
    Positive Logits
    Token           Logit
    "Allah"         0.82
    "wife"          0.80
    "hood"          0.78
    "daughter"      0.77
    "abet"          0.73
    "brother"       0.73
    "backer"        0.71
    "chief"         0.71
    "aiden"         0.70
    "son"           0.70
    Activation Density: 0.069%

    No Known Activations