INDEX
    Explanations

    occurrences of significant events or actions, particularly those involving changes or decisions in various contexts

    Negative Logits
    Token        Logit
    " tend"      -0.16
    " tended"    -0.15
    "yal"        -0.15
    "iated"      -0.14
    "/"          -0.14
    "."          -0.13
    " Tells"     -0.13
    " deserve"   -0.13
    " contain"   -0.13
    " consec"    -0.13
    Positive Logits
    Token        Logit
    " follows"   0.36
    " follow"    0.26
    " Follow"    0.24
    " marks"     0.23
    "Follow"     0.23
    "follow"     0.22
    " coinc"     0.22
    " comes"     0.22
    ".follow"    0.21
    " FOLLOW"    0.21
    Act Density 0.158%

    No Known Activations