INDEX
    Explanations

    names of specific individuals, particularly politicians

    mentions of specific individuals, particularly politicians and notable figures

    Negative Logits
    ters      -0.78
    phrine    -0.78
    ching     -0.77
    ths       -0.73
    cise      -0.72
    brance    -0.71
    omial     -0.67
    htt       -0.67
    monds     -0.67
    yright    -0.65
    Positive Logits
    ozy       0.93
    inian     0.86
    olini     0.83
    kas       0.77
    anye      0.76
    schild    0.75
    anski     0.75
    lain      0.73
    ika       0.72
    inia      0.71
    Activation Density: 0.078%

    No Known Activations