INDEX
    Negative Logits
    Logit   Token
    -0.07   "u"
    -0.06   "=t"
    -0.06   "pectrum"
    -0.06   (blank)
    -0.06   " και"
    -0.06   "Responsive"
    -0.06   (blank)
    -0.06   " mand"
    -0.06   "=v"
    -0.06   "\tnum"
    Positive Logits
    Logit   Token
    0.07    " burglary"
    0.07    " Guil"
    0.06    " agreement"
    0.06    " worldly"
    0.06    " آب"
    0.06    " نب"
    0.06    "นะ"
    0.06    " sleep"
    0.06    ".labelX"
    0.06    " Sleep"
    Activation Density: 0.016%

    No Known Activations