    Negative Logits
    "_"          0.43
    " isang"     0.38
    "ل"          0.38
    " izvor"     0.38
    " costru"    0.36
    " crianças"  0.36
    " edhe"      0.35
    " stesse"    0.35
    " pamoja"    0.35
    " suatu"     0.35
    Positive Logits
    "t"          0.54
    "m"          0.53
    " ("         0.51
    "g"          0.45
    "i"          0.42
    "c"          0.39
    "w"          0.38
    "x"          0.38
    "b"          0.38
    "p"          0.38
    Activation Density: 0.212%

    No Known Activations