    Negative Logits
    Token    Logit
    د        1.68
    d        1.64
    y        1.57
    س        1.56
    i        1.27
    h        1.26
    in       1.20
    s        1.17
    ح        1.16
    g        1.07
    Positive Logits
    Token    Logit
    н        1.45
    нду      1.27
    nelle    1.19
    あります  1.17
    νου      1.13
    ν        1.13
    ли       1.12
    ない      1.08
    νά       1.05
    law      1.03
    Activation Density: 0.263%

    No Known Activations