INDEX
    Explanations

    codes or identifiers

    Negative Logits
    もっと
    0.35
     پیدا
    0.34
    より
    0.33
    えず
    0.33
    ỡi
    0.32
    0.31
     claramente
    0.31
    ÍC
    0.31
     basically
    0.31
     van
    0.30
    Positive Logits
     Bhub
    0.41
     habilidad
    0.38
    strength
    0.38
    נים
    0.36
     কাল
    0.36
    तलब
    0.36
    0.35
    Strength
    0.35
    Boundary
    0.34
    subNav
    0.34
    Act Density 0.012%

    No Known Activations