INDEX
    Explanations

    syntactical structures and code-like syntax elements

    Negative Logits
    8
    -0.32
    八
    -0.29
     八
    -0.28
     eight
    -0.27
     August
    -0.25
    eight
    -0.24
     Aug
    -0.23
     Eight
    -0.23
    Eight
    -0.23
    08
    -0.23
    Positive Logits
                       
    0.29
                   
    0.28
    107
    0.21
    105
    0.20
    				
    0.19
    	               
    0.18
     Seven
    0.17
    			
    0.16
    ↵                    ↵
    0.16
     Juli
    0.16
    Act Density 0.030%

    No Known Activations