INDEX
    Explanations

    specific Polish words and phrases related to personal experiences and emotions

    Negative Logits
    Token       Logit
    kus         -0.19
    kil         -0.18
    kok         -0.16
    okable      -0.15
    buch        -0.15
    衣          -0.15
    oč          -0.14
    νού         -0.14
    Ī           -0.14
    セン        -0.14
    Positive Logits
    Token       Logit
    ж           0.55
    ž           0.53
    ż           0.52
    жа          0.43
    ży          0.42
    жи          0.41
    Ж           0.41
    же          0.41
    ži          0.41
    ží          0.41
    Act Density 0.024%
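    As a rough illustration of the density figure, assuming "Act Density" means the fraction of sampled tokens on which the feature's activation is nonzero: 0.024% would correspond to about 24 active tokens per 100,000. The sketch below uses a hypothetical helper, `activation_density`, and made-up sample data; neither comes from the dashboard itself.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of tokens whose activation exceeds `threshold`.

    Assumption: density is counted per token over a sample of activations.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: 24 active tokens out of 100,000.
sample = [1.0] * 24 + [0.0] * 99_976
density = activation_density(sample)
print(f"{density:.3%}")  # prints 0.024%
```

    A density this low means the feature is silent on nearly every token, which fits with the absence of recorded activations.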

    No Known Activations