    Explanations

    Polish words associated with social and collective themes

    Negative Logits
     overl
    -0.16
    vester
    -0.16
     besides
    -0.15
    iky
    -0.15
     Class
    -0.15
    erv
    -0.15
    lav
    -0.14
    GY
    -0.14
    andle
    -0.14
     Bav
    -0.14
    Positive Logits
    że
    0.21
     nie
    0.21
    ują
    0.21
    ję
    0.20
    ł
    0.20
    acz
    0.20
    ją
    0.20
    ów
    0.19
    ż
    0.19
     pow
    0.19
    Activation Density 0.315%

    No Known Activations