INDEX
    Explanations
    NEGATIVE LOGITS
    Token        Logit
    "ediği"      -0.06
    "ديث"        -0.06
    "ALCHEMY"    -0.06
    " hue"       -0.06
    " Verfüg"    -0.06
    "ểm"         -0.06
    "(pred"      -0.06
    "usal"       -0.06
    " Bedroom"   -0.06
    "_visual"    -0.05
    POSITIVE LOGITS
    Token            Logit
    " борь"          0.07
    " bureaucratic"  0.07
    ".pad"           0.06
    " (...)"         0.06
    ".bootstrap"     0.06
    " бл"            0.06
    " session"       0.06
    "BOOLE"          0.06
    "_beg"           0.06
    ".toLowerCase"   0.06
    Activation Density: 0.042%

    No Known Activations