INDEX
    Explanations

    references to different time points and changes in circumstances

    Negative Logits
    Token            Logit
    "alim"           -0.15
    "ish"            -0.15
    "wap"            -0.15
    "eren"           -0.14
    "ats"            -0.14
    " beginnings"    -0.14
    " discourse"     -0.14
    "egin"           -0.14
    "uma"            -0.14
    " this"          -0.14
    Positive Logits
    Token            Logit
    " around"        0.45
    " round"         0.40
    "Around"         0.40
    "around"         0.40
    " Around"        0.39
    "-around"        0.38
    "round"          0.36
    "-round"         0.33
    "ROUND"          0.29
    " autour"        0.29
    Activation Density: 0.041%
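
For readers unfamiliar with the density figure, here is a minimal sketch of how such a number could be computed, assuming it means the fraction of token positions on which the feature's activation is above zero. The function name `activation_density`, the `threshold` parameter, and the sample data are all hypothetical and are not taken from this page or from any particular tool.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds `threshold`.

    Assumption: "density" means the share of positions where the feature fires.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical data: 10,000 token positions, 4 of which activate the feature.
sample = [0.0] * 9996 + [0.7, 1.2, 0.3, 2.1]
density = activation_density(sample)
print(f"{density:.3%}")
```

Under this reading, a density of 0.041% would correspond to the feature firing on roughly 4 out of every 10,000 tokens.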

    No Known Activations