INDEX
    Explanations

    lines of text ending with a specific character, the colon, which has high activation values

    sequences of punctuation marks and structured text elements

    Negative Logits
    "Leaks"             -0.78
    " ingred"           -0.70
    "soDeliveryDate"    -0.68
    " hemor"            -0.67
    "æ©"                -0.66
    " pestic"           -0.65
    " principals"       -0.65
    "ensibly"           -0.64
    " exting"           -0.64
    " Palestin"         -0.61
    Positive Logits
    " âĨij"             0.89
    "Show"              0.76
    " Adding"           0.69
    " Originally"       0.68
    " Cly"              0.68
    " Interesting"      0.66
    "itars"             0.65
    "leon"              0.64
    "Originally"        0.64
    " inherit"          0.63
    Act Density 0.071%

    No Known Activations