INDEX
    Explanations

    coding syntax and structure indicators

    Negative Logits
    Token       Logit
    "172"       -0.20
    " Feb"      -0.19
    " Sep"      -0.19
    "173"       -0.18
    "128"       -0.18
    " Kin"      -0.18
    "178"       -0.17
    "177"       -0.16
    "512"       -0.16
    "222"       -0.15
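    Positive Logits
    Token                                 Logit
    (spaces)                              0.46
    "105"                                 0.35
    "106"                                 0.23
    "Ľ"                                   0.21
    "åįģäºĶ"                              0.20
    "15"                                  0.20
    " fifteen"                            0.20
    (tab, then spaces)                    0.20
    (two tabs, then spaces)               0.19
    (spaces, newline, spaces, newline)    0.19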
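
The positive-logit tokens Ľ and åįģäºĶ look like byte-level BPE display strings rather than literal text. The page does not say which tokenizer the model uses, so the sketch below assumes a GPT-2-style byte-to-unicode table and maps each displayed character back to its raw byte. The helper names (byte_to_unicode_table, token_to_bytes, token_to_text) are hypothetical and not part of any dashboard API.

```python
def byte_to_unicode_table():
    """Build a GPT-2-style byte -> display-character table (assumed tokenizer)."""
    printable = (
        list(range(ord("!"), ord("~") + 1))
        + list(range(ord("¡"), ord("¬") + 1))
        + list(range(ord("®"), ord("ÿ") + 1))
    )
    # Printable bytes are displayed as themselves.
    table = {b: chr(b) for b in printable}
    offset = 0
    for b in range(256):
        if b not in table:
            # Remaining bytes are shifted to code points starting at 256.
            table[b] = chr(256 + offset)
            offset += 1
    return table


# Inverse table: display character -> original byte value.
_CHAR_TO_BYTE = {ch: b for b, ch in byte_to_unicode_table().items()}


def token_to_bytes(token: str) -> bytes:
    """Recover the raw bytes behind a displayed token string."""
    return bytes(_CHAR_TO_BYTE[ch] for ch in token)


def token_to_text(token: str) -> str:
    """Decode the bytes as UTF-8; incomplete sequences become U+FFFD."""
    return token_to_bytes(token).decode("utf-8", errors="replace")


print(token_to_text("åįģäºĶ"))  # 十五
print(token_to_bytes("Ľ"))       # b'\x9b'
```

Under that assumption, åįģäºĶ is the Chinese numeral 十五 ("fifteen"), which fits the neighbouring "15" and " fifteen" tokens, while Ľ is a lone UTF-8 continuation byte that has no standalone text form.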
    Act Density 0.036%

    No Known Activations