INDEX
    Explanations

    names of politicians, locations, and legal terms related to individuals

    Negative Logits
    "<bos>"                        -2.68
    "<?"                           -0.96
    (whitespace token)             -0.95
    "/***" + newline               -0.93
    (whitespace or empty token)    -0.80
    "<?" + newline                 -0.79
    "/**"                          -0.79
    "//---"                        -0.71
    " endow"                       -0.64
    " abolish"                     -0.62
    Positive Logits
    " Rep"     1.14
    "Rep"      1.03
    " Reps"    0.99
    " jawa"    0.95
    " rep"     0.95
    " REP"     0.94
    "rep"      0.93
    " reps"    0.90
    "REP"      0.85
    "Reps"     0.83
    Act Density 0.108%

    No Known Activations