INDEX
    Explanations

    dates or events in reported news articles

    punctuation and contextual markers within the text

    Negative Logits
    " pound"          -0.69
    " ethn"           -0.66
    " inward"         -0.62
    " iod"            -0.61
    " individually"   -0.60
    " ambassadors"    -0.59
    "uary"            -0.59
    " religiously"    -0.58
    "edom"            -0.58
    " unfamiliar"     -0.57
    Positive Logits
    " Logged"         1.27
    " SEE"            1.27
    "Reviewer"        1.19
    " Photo"          1.11
    " Posted"         1.05
    " Loading"        0.87
    " View"           0.87
    " Listen"         0.87
    "ļéĨĴ"            0.85
    "<|endoftext|>"   0.84
    Activation Density: 0.144%

    No Known Activations