INDEX
    Explanations

    references to programming concepts and terminology

    NEGATIVE LOGITS
    utenants    -0.72
    ه           -0.69
    letoe       -0.67
    e           -0.65
    phrag       -0.62
    Slf         -0.60
    dew         -0.59
    1           -0.58
    2           -0.58
    3           -0.56
    POSITIVE LOGITS
     myſelf        1.04
     himſelf       1.03
    ſelf           1.02
     itſelf        1.01
     themſelves    0.99
    ſelves         0.94
     paſſ          0.94
     enfans        0.88
     ſmall         0.86
    ;">            0.86
    Act Density 1.782%

    No Known Activations