INDEX
    Explanations

    numerical values associated with specific codes or identifiers

    Negative Logits
    <em>
    -0.82
    </strong>
    -0.80
    -0.75
    </em>
    -0.71
      
    -0.70
    <eos>
    -0.70
    ...
    -0.70
    -0.69
     “
    -0.65
     é
    -0.60
    Positive Logits
    		
    1.38
    	
    1.35
    			
    1.33
     myſelf
    1.32
    					
    1.28
     ―――――
    1.27
    				
    1.26
    						
    1.25
    							
    1.24
     Monfieur
    1.24
    Act Density 0.301%

    No Known Activations