INDEX
    Explanations
    Negative Logits
    " paired"         -0.06
    " palabra"        -0.06
    " ambigu"         -0.06
    " ------------------------------------------------"  -0.06
    "province"        -0.06
    "平均"            -0.06
    " rgb"            -0.06
    " علت"            -0.06
    "ãn"              -0.06
    "(peer"           -0.06
    Positive Logits
    " inertia"        0.13
                      0.07
    "IA"              0.07
    "ertia"           0.07
    "-ID"             0.07
    "��"              0.06
    " CONSEQUENTIAL"  0.06
    " BODY"           0.06
    "->{$"            0.06
                      0.06
    Act Density 0.001%

    No Known Activations