INDEX
    Explanations

    established or common ideas

    Negative Logits
    0.39
    Dz
    0.38
     Primarily
    0.38
    0.37
    異なる
    0.37
    違う
    0.37
     इलेक्ट्रिक
    0.37
     ANZ
    0.37
    نہ
    0.36
    मंत्र
    0.36
    Positive Logits
    commonplace    0.90
    common         0.70
    comum          0.69
    常見           0.67
    常见的         0.66
    comuns         0.64
    常见           0.63
    আগেও           0.63
    routinely      0.61
    Already        0.61
    Act Density 0.659%

    No Known Activations