INDEX
    Explanations

    programming syntax and structures

    Negative Logits
    Token          Logit
    "ngo"          -0.18
    "Twenty"       -0.18
    " Twenty"      -0.17
    " nineteen"    -0.17
    "33"           -0.17
    "332"          -0.17
    " twenty"      -0.17
    "34"           -0.16
    "19"           -0.16
    "twenty"       -0.16
    Positive Logits
                      
    0.41
                     
    0.38
                       
    0.32
                    
    0.30
    110
    0.26
    	               
    0.26
     				
    0.23
    				
    0.23
    	                
    0.23
    116
    0.22
    Act Density 0.010%

    No Known Activations