    Explanations

    lines of code or comments in a programming context

    Negative Logits
    " —,"       -0.78
    " Bona"     -0.76
    "ation"     -0.72
    "nas"       -0.70
    "istani"    -0.70
    "ona"       -0.68
    "Bon"       -0.67
    "cillo"     -0.66
    "—,"        -0.66
    "al"        -0.66
    Positive Logits
    "///"       1.66
    " ///"      1.33
    "///\n"     1.02
    "/////"     0.90
    "///<"      0.84
    "ায়"        0.84
    "phazard"   0.84
    "ţiile"     0.81
    "yto"       0.80
    "്‍"         0.79
    Activation Density: 0.042%

    No Known Activations