INDEX
    Explanations
    NEGATIVE LOGITS
     ਅਤੇ             0.40
     ۽               0.36
     Europäischen    0.34
     និង             0.33
     Dieser          0.33
     nimmt           0.33
    éssel            0.32
     Diese           0.32
    wili             0.31
     ۋە              0.31
    POSITIVE LOGITS
     chỉ             0.49
     a               0.48
     только          0.48
     as              0.46
     only            0.45
     in              0.43
    го               0.39
     hanya           0.38
     not             0.37
     sadece          0.37
    Activation Density: 0.115%

    No Known Activations