INDEX
    Explanations

    key-value pairs or identifiers

    Negative Logits
    -0.96
    Ayrıca     -0.94
    groet      -0.91
     kaž       -0.91
     stát      -0.90
     telas     -0.90
     Käufer    -0.90
     tré       -0.89
     komfort   -0.87
    digos      -0.85
    Positive Logits
     خودش      0.96
    還會       0.91
     самому    0.85
     €)        0.85
     itself    0.82
    ferous     0.80
     Feste     0.80
     zaten     0.79
     bosco     0.78
     celebre   0.78
    Act Density 0.033%

    No Known Activations