INDEX
    Explanations

    indicates a temporal sequence

    Negative Logits
    Token           Logit
    " "             0.32
                    0.28
    "۔"             0.28
    " يقول"         0.27
    " WHEN"         0.27
    " badań"        0.26
    " ngunit"       0.26
    " ketika"       0.26
    " sırasında"    0.26
    " إذا"          0.25
    Positive Logits
    Token           Logit
    "being"         0.28
    " being"        0.26
    "cura"          0.25
    "cuss"          0.24
    "curr"          0.24
    "chy"           0.24
    "dept"          0.24
                    0.24
    "ório"          0.24
    "t"             0.23
    Activation Density: 0.102%

    No Known Activations