INDEX
    Explanations

    segments related to software updates and their functionalities

    Negative Logits
    Token         Logit
    "<bos>"       -0.69
    " an"         -0.67
    " in"         -0.60
    " no"         -0.60
    " a"          -0.57
    " so"         -0.56
    " e"          -0.55
    "ur"          -0.55
    " too"        -0.55
    " for"        -0.55

    Positive Logits
    Token         Logit
    " purpoſe"    1.64
    " Houſe"      1.58
    " houſe"      1.57
    " ſtate"      1.55
    " Majefty"    1.54
    " ſche"       1.50
    " itſelf"     1.49
    " Anſ"        1.48
    " Reſ"        1.47
    " myſelf"     1.46

    Act Density 0.056%

    No Known Activations