    NEGATIVE LOGITS
    " unborn"           -0.07
    "python"            -0.07
    " Salon"            -0.06
    " Lynch"            -0.06
    "Número"            -0.06
    " Tian"             -0.06
    " substitutions"    -0.06
    "(import"           -0.06
    " Java"             -0.06
                        -0.06
    POSITIVE LOGITS
    "AXB"               0.07
    "_probe"            0.07
    " opět"             0.07
    " вида"             0.06
    "modifiers"         0.06
    " sudah"            0.06
    " boa"              0.06
    "esel"              0.06
    " δε"               0.06
    "_vertical"         0.06
    Activation Density: 0.004%

    No Known Activations