INDEX
    Explanations

    the symbol '-' indicating lists or negative items

    Negative Logits
    Token             Logit
    "<bos>"           -2.27
    (not shown)       -0.77
    " implement"      -0.61
    (blank)           -0.60
    " get"            -0.60
    " butterknife"    -0.60
    ","               -0.59
    "<eos>"           -0.59
    " in"             -0.59
    " put"            -0.59
    Positive Logits
    Token             Logit
    " wien"            1.74
    " affor"           1.73
    " lele"            1.68
    " accla"           1.66
    " volunte"         1.64
    " coö"             1.64
    " emphat"          1.63
    " unlaw"           1.63
    " maneu"           1.63
    " increa"          1.60
    Act Density 0.105%

    No Known Activations