INDEX
    Explanations

    mentions of events or actions that started or initiated something new

    instances of the word "started" in various contexts

    Negative Logits
    "cit"           -0.81
    "etry"          -0.75
    "acho"          -0.73
    "ighth"         -0.73
    "ethy"          -0.73
    "âĨij"          -0.72
    "alted"         -0.71
    "ugs"           -0.69
    "ses"           -0.69
    "ingly"         -0.69
    Positive Logits
    " anew"         0.93
    " raining"      0.77
    " PRESS"        0.76
    "OCK"           0.75
    "ŃĶ"            0.74
    " airing"       0.73
    " fuss"         0.71
    " circulating"  0.71
    " behaving"     0.71
    " dating"       0.67
    Act Density 0.056%

    No Known Activations