INDEX
    Explanations
    No Explanations Found
    Negative Logits
    Token                                   Logit
    "<bos>"                                 -1.63
    " springfox"                            -0.84
    (token not shown)                       -0.73
    " expand"                               -0.70
    " establish"                            -0.70
    "<?" (followed by a line break)         -0.69
    " colspan"                              -0.68
    (blank lines only)                      -0.68
    " engage"                               -0.67
    "/**"                                   -0.66
    Positive Logits
    Token                                   Logit
    " accla"                                 1.68
    " véhic"                                 1.64
    " affor"                                 1.63
    " délib"                                 1.60
    " effe"                                  1.60
    "de"                                     1.59
    " mef"                                   1.56
    " wien"                                  1.56
    " maneu"                                 1.56
    " stockholm"                             1.55
    Activation Density: 0.124%

    No Known Activations