    Negative Logits
    " oryginal"    -0.98
    " świą"        -0.97
    " Č"           -0.93
    " całym"       -0.90
    " zaby"        -0.87
    " postponed"   -0.83
    "Č"            -0.81
    " А"           -0.79
    " św"          -0.79
    " niyang"      -0.79
    Positive Logits
    " Polish"      1.16
    " Poland"      1.05
    "🇵"            1.04
    "Wis"          0.98
    "Polish"       0.96
    " Chopin"      0.95
    " fotografii"  0.93
    "ienki"        0.92
    " komer"       0.91
    "Poland"       0.90
    Activation Density: 0.080%

    No Known Activations