INDEX
    Explanations
    Negative Logits
    ponce          -0.07
    .dead          -0.07
     المعلومات     -0.07
    Boot           -0.06
    _GATE          -0.06
     byste         -0.06
    middleware     -0.06
    dateTime       -0.06
    eddar          -0.06
     sexdate       -0.06
    Positive Logits
     parcels       0.09
     loại          0.07
     урож          0.07
     slice         0.06
    (slot          0.06
    ÔNG            0.06
     conseils      0.06
     ceiling       0.06
     plot          0.06
    čně            0.06
    Activation Density: 0.003%

    No Known Activations