INDEX
    Explanations

    the presence of a specific token structure, primarily a recurring pattern in the text

    Negative Logits
    Token          Logit
    " Kislyak"     -0.70
    "verett"       -0.67
    "mble"         -0.63
    "andum"        -0.63
    " Greenwald"   -0.62
    " Qiao"        -0.62
    "aukee"        -0.62
    " Canaver"     -0.61
    " Cheong"      -0.59
    "TIME"         -0.58

    Positive Logits
    Token          Logit
    "ouse"         1.10
    "ulhu"         0.93
    "orse"         0.88
    "rift"         0.87
    "ttp"          0.86
    "iop"          0.85
    "some"         0.85
    "orne"         0.83
    "orst"         0.82
    "shire"        0.82
    Act Density 0.005%

    No Known Activations