    NEGATIVE LOGITS
    Logit  Token
    1.17   "いる"
    1.12   "ون"
    1.09   "いた"
    1.05   "ors"
    1.01   " आठ"
    0.99   "وان"
    0.96   "א"
    0.92   "ים"
    0.91   "вое"
    0.91   " उपलब्धि"
    POSITIVE LOGITS
    Logit  Token
    1.15   "to"
    0.98   "'"
    0.89   " Relative"
    0.87   "ต์"
    0.86   "p"
    0.84   "is"
    0.84   " relativ"
    0.80   " jim"
    0.80   " to"
    0.79   "orak"
    Activation Density: 0.093%

    No Known Activations