INDEX
    Explanations

    dates formatted as month followed by a number, such as "June 3rd"

    dates and temporal references in the text

    Negative Logits
    " hygiene"     -0.69
    "igans"        -0.61
    " CARD"        -0.59
    "illas"        -0.57
    "ーテ"         -0.57
    "udging"       -0.56
    "igham"        -0.56
    " interven"    -0.56
    "idences"      -0.55
    "ファ"         -0.54
    Positive Logits
    "th"           1.01
    "rd"           0.99
    "TH"           0.79
    "nd"           0.78
    "″"            0.72
    "ths"          0.71
    "2200"         0.70
    "′"            0.70
    "stice"        0.69
    "itia"         0.69
    Act Density 0.075%

    No Known Activations