    Explanations

    references to data structures and parameters in code

    Negative Logits
    ']):             -1.24
    '):              -1.19
    ']:              -1.17
    '])->            -1.13
    '],              -1.11
    ()){             -1.09
    '){              -1.08
    ')):             -1.08
    "]:              -1.07
    '}>              -1.07
    Positive Logits
                     1.28
    ↵↵↵              1.04
    ↵↵               0.80
     purpoſe         0.69
    </blockquote>    0.68
    ↵↵↵↵↵            0.67
     himſelf         0.67
    ↵↵↵↵             0.66
    <eos>            0.63
     Jefus           0.60
    Act Density 0.248%

    No Known Activations