INDEX
    Explanations
    Negative Logits
    Token            Value
    "al"             1.34
    "g"              1.22
    "er"             1.20
    "-"              1.18
    " καθώς"         1.17
    ","              1.16
    "/"              1.15
    "kan"            1.14
    "arın"           1.13
    " که"            1.12
    Positive Logits
    Token            Value
    " It"            1.19
    "ו"              1.13
    " xml"           1.06
    " Puoi"          1.05
    " an"            0.97
    " Xml"           0.96
    " inzwischen"    0.95
    " insgesamt"     0.95
    "на"             0.94
    " XML"           0.94
    Activation Density: 0.007%

    No Known Activations