INDEX
    Explanations

    phrases or names related to political figures and events

    mentions of specific individuals, particularly political figures

    Negative Logits
    istical     -0.75
    imates      -0.73
    ORE         -0.65
    uate        -0.65
     Grateful   -0.64
     Sorcerer   -0.63
     Metatron   -0.62
     Mirror     -0.62
     CoC        -0.61
    redit       -0.61
    Positive Logits
    lette       0.97
     Rouse      0.93
     Rousse     0.91
    stal        0.88
    ff          0.87
    lin         0.80
    LB          0.77
    cia         0.77
    ĸļ          0.76
    utics       0.76
    Activation Density: 0.009%

    No Known Activations