INDEX
    Explanations

    references to locations or addresses

    Negative Logits
    Token           Logit
    "<bos>"         -0.78
    "public"        -0.69
    "</tbody>"      -0.64
    " establish"    -0.63
    "/*"            -0.63
    " activate"     -0.63
    " overcome"     -0.60
    "//"            -0.60
    " ensure"       -0.60
    " improve"      -0.60
    Positive Logits
    Token           Logit
    " k"            1.78
    "k"             1.74
    " accla"        1.52
    " affor"        1.51
    " unden"        1.50
    " unlaw"        1.46
    " bourgeo"      1.45
    " embodi"       1.45
    " emphat"       1.40
    " guarante"     1.37
    Activation Density: 0.166%

    No Known Activations