INDEX
    Explanations

    conjunctions that introduce contrasting information

    Negative Logits
    <bos>                          -2.51
    (token not shown)              -1.20
    (whitespace only)              -1.06
    <?                             -0.99
    /**                            -0.89
    /*                             -0.88
    <? (followed by a newline)     -0.84
    /*** (followed by a newline)   -0.81
    /*++                           -0.80
    public                         -0.73
    Positive Logits
    " maneu"       2.12
    " affor"       1.94
    " impra"       1.90
    " accla"       1.88
    " increa"      1.83
    " stockholm"   1.79
    " shenan"      1.76
    " scrat"       1.75
    " reluct"      1.74
    " disagre"     1.74
    Activation Density 0.315%
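
The density figure above is usually read as the fraction of sampled token positions on which the feature fires. The sketch below illustrates that reading only. The function name `activation_density`, the `threshold` parameter, and the sample data are hypothetical and are not taken from this page or from any particular library.

```python
from typing import Sequence


def activation_density(activations: Sequence[float], threshold: float = 0.0) -> float:
    """Return the fraction of token positions whose activation exceeds `threshold`.

    Assumption: "density" means the share of tokens with a non-zero activation.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: the feature fires on 3 of 1000 token positions.
sample = [0.0] * 997 + [0.4, 1.2, 2.5]
print(f"{activation_density(sample):.3%}")  # prints 0.300%
```

Under this reading, a density of 0.315% would mean the feature is active on roughly 3 of every 1000 sampled tokens.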

    No Known Activations