    Explanations

    conditional statements within programming code, and potentially temporal references in regular text.

    Negative Logits
    Token         Logit
     purpoſe      -1.37
    ſelf          -1.37
     Majefty      -1.36
     themſelves   -1.34
    ^(@)          -1.34
     Efq          -1.33
     whoſe        -1.31
     myſelf       -1.29
     becauſe      -1.29
     himſelf      -1.23
    Positive Logits
    Token         Logit
    <bos>         1.38
                  1.30
    ↵↵            1.11
    "             0.98
    e             0.95
    a             0.94
    '             0.93
    <eos>         0.90
    i             0.90
    n             0.90
    Activation Density: 2.746%

    No Known Activations