INDEX
    Explanations

    references to Bill and Hillary Clinton

    Negative Logits
    Token         Logit
    "BS"          -0.15
    "ward"        -0.15
    " Patri"      -0.15
    "occo"        -0.14
    " Wed"        -0.14
    "ầm"          -0.14
    "appa"        -0.14
    "mino"        -0.13
    "weis"        -0.13
    "udent"       -0.13
    Positive Logits
    Token         Logit
    "istrovství"  0.17
    "legg"        0.15
    " Ruf"        0.15
    "rell"        0.15
    "ilde"        0.15
    "abad"        0.14
    "яти"         0.14
    "jerne"       0.14
    ".ov"         0.14
    "ersh"        0.14
    Act Density 0.004%

    No Known Activations