INDEX
    Explanations

    punctuation marks and function indicators in code snippets

    Negative Logits
    Token       Logit
    "ahren"     -0.16
    "ehler"     -0.15
    "izi"       -0.15
    "frog"      -0.15
    "144"       -0.15
    "apes"      -0.15
    "cul"       -0.14
    " Kiev"     -0.14
    "ixer"      -0.14
    "enco"      -0.14
    Positive Logits
    Token                 Logit
    (whitespace only)     0.21
    (whitespace only)     0.18
    (whitespace only)     0.15
    " Gregory"            0.15
    " Bakan"              0.14
    "fragistics"          0.14
    "ган"                 0.14
    "-placeholder"        0.14
    "peats"               0.14
    "ordan"               0.14
    Activation Density: 0.045%

    No Known Activations