INDEX
    NEGATIVE LOGITS
    其他
    0.92
     assorted
    0.88
    ചര്യ
    0.86
     اکاؤن
    0.84
     Secara
    0.83
    Jika
    0.83
     그렇게
    0.81
    0.80
    0.79
     అయితే
    0.78
    POSITIVE LOGITS
     as
    1.45
    p
    1.20
     are
    1.16
     in
    1.14
    ية
    1.06
    c
    1.05
    m
    1.05
     was
    1.04
    t
    1.02
    1.01
    Act Density 0.191%

    No Known Activations