INDEX
    Explanations

    references to Poland or Polish culture

    Negative Logits
    orial       -0.15
    icap        -0.15
    iles        -0.15
    esen        -0.14
    anker       -0.14
    azine       -0.14
    editable    -0.14
    ieval       -0.14
    ownt        -0.13
    hdl         -0.13
    Positive Logits
    olu         0.20
    elik        0.17
    enta        0.17
    ych         0.16
    itical      0.16
    itics       0.16
    r           0.15
    mav         0.15
    uter        0.15
    indr        0.15
    Activation Density 0.024%

    No Known Activations