INDEX
    Explanations

    dates written in a specific format

    references to events and entities, particularly names and dates

    Negative Logits
    Token         Logit
    " Ples"       -0.84
    " Streets"    -0.84
    " Turtles"    -0.84
    " Tos"        -0.82
    " Spo"        -0.79
    "Spot"        -0.77
    " pim"        -0.77
    " Pom"        -0.75
    " Ts"         -0.74
    " Kis"        -0.73
    Positive Logits
    Token         Logit
    "221"         1.13
    " 221"        1.02
    "arn"         1.00
    "220"         0.99
    "211"         0.98
    "21"          0.98
    "223"         0.97
    "jab"         0.95
    " 220"        0.93
    "HER"         0.92
    Act Density 0.550%

    No Known Activations