INDEX
    Explanations

    proper names related to a particular novel being analyzed

    Negative Logits
    public
    -0.81
                
    -0.80
     have
    -0.79
     engage
    -0.79
     raise
    -0.79
    <bos>
    -0.78
     can
    -0.78
    			
    -0.78
            
    -0.77
    .
    -0.77
    Positive Logits
     fta
    2.29
     secon
    2.22
     strick
    2.19
     Len
    2.19
     dispen
    2.19
     aen
    2.19
     affor
    2.19
     effe
    2.17
     fuf
    2.16
     squa
    2.15
    Act Density 0.152%

    No Known Activations