INDEX
    Explanations
    NEGATIVE LOGITS
    Hist
    -0.07
    ANGE
    -0.07
    ittle
    -0.06
                         
    -0.06
    ancy
    -0.06
     Hist
    -0.06
     Ukraine
    -0.06
     Friday
    -0.06
    inese
    -0.06
     Focus
    -0.06
    POSITIVE LOGITS
     قتل
    0.07
    评价
    0.06
     psycopg
    0.06
    0.06
    (reinterpret
    0.06
     له
    0.06
     yOffset
    0.06
     offset
    0.06
     thermostat
    0.06
    jf
    0.06
    Act Density 0.057%

    No Known Activations