    Explanations

    information or steps in a set of instructions

    Negative Logits
    Token        Logit
    "<bos>"      -2.41
    "if"         -0.72
    "public"     -0.71
    " assume"    -0.67
    " if"        -0.67
                 -0.64
    "יע"         -0.63
    " look"      -0.63
    "case"       -0.62
    " be"        -0.62
    Positive Logits
    Token        Logit
    " emphat"    1.89
    " Juf"       1.83
    " guarante"  1.74
    " Augu"      1.73
    " maneu"     1.71
    " aen"       1.70
    " accla"     1.69
    " fta"       1.67
    " inev"      1.67
    " squa"      1.66
    Act Density 0.736%

    No Known Activations