    NEGATIVE LOGITS
    [token missing]     -0.74
    "endpush"           -0.73
    " &___"             -0.73
    " Paglinawan"       -0.72
    "writeFieldEnd"     -0.71
    "########."         -0.70
    "SharedDtor"        -0.68
    " Italijanski"      -0.65
    " ✭✭"               -0.65
    " Wikimedijinoj"    -0.64
    POSITIVE LOGITS
    " ("         0.64
    "elsif"      0.49
    " ["         0.44
    " either"    0.43
    " Either"    0.43
    "elseif"     0.43
    " (("        0.42
    " dass"      0.42
    "mrow"       0.42
    "if"         0.42
    Activation Density: 0.042%

    No Known Activations