    NEGATIVE LOGITS
    "י"  1.61
    ">∕"  1.54
    (not captured)  1.52
    (not captured)  1.50
    "на"  1.45
    (not captured)  1.43
    " argento"  1.42
    " лидер"  1.42
    "ü"  1.41
    (not captured)  1.41
    POSITIVE LOGITS
    "تیب"  1.34
    "はお"  1.28
    "стью"  1.27
    "0"  1.24
    "io"  1.21
    "ことがあります"  1.19
    "ies"  1.16
    "াচার"  1.16
    "াস"  1.15
    " Judgment"  1.15
    Act Density 0.352%

    No Known Activations