    Explanations

    attends to error-related tokens in code segments that handle exceptions or perform error checking

    Head Attr Weights
    0:0.02
    1:0.55
    2:0.06
    3:0.02
    4:0.04
    5:0.17
    6:0.05
    7:0.05
    Negative Logits
     myſelf
    -1.70
     itſelf
    -1.66
     Efq
    -1.58
     Theſe
    -1.51
     Monfieur
    -1.48
    ſelf
    -1.44
     ―――――
    -1.43
     pleaſure
    -1.42
     ſeveral
    -1.41
     Jefus
    -1.41
    Positive Logits
      
    0.85
    0.84
     (
    0.79
    .
    0.74
     in
    0.71
    <eos>
    0.69
    ,
    0.67
     of
    0.67
        
    0.66
     [
    0.66
    Act Density 0.033%

    No Known Activations