INDEX
    Explanations

    code snippets from different programming languages and various types of punctuation, particularly apostrophes

    Negative Logits
     dévelo
    -0.72
     normaux
    -0.68
     Vikipedi
    -0.66
     eût
    -0.65
     auroit
    -0.64
     démocr
    -0.63
    InSection
    -0.63
     giusti
    -0.63
     cucchiai
    -0.62
     zelve
    -0.62
    Positive Logits
     ")");
    0.65
    '");
    0.65
    ()]);
    0.64
    ));
    0.60
    })();
    0.60
    }');
    0.59
    ]");
    0.59
    ]');
    0.57
    )");
    0.57
    )));
    0.57
    Act Density 0.443%

    No Known Activations