    Negative Logits
    -0.07
    .Service
    -0.07
     dinner
    -0.07
     atomic
    -0.07
    ܢ
    -0.07
    _exit
    -0.07
     PATCH
    -0.07
    discord
    -0.07
    (expression
    -0.06
    Hat
    -0.06
    Positive Logits
    0.07
     deleg
    0.07
    CellStyle
    0.07
    ³
    0.07
     wg
    0.07
     wartości
    0.07
    0.07
    0.07
    0.06
    ++;
    0.06
    Activation Density 0.012%

    No Known Activations