INDEX
    Explanations

    items listed in sequence

    conjunctions and phrases indicating continuity or sequential actions

    Negative Logits
        Token (in single quotes)   Logit
        'Ĥª'                       -0.68
        'MET'                      -0.65
        'ox'                       -0.63
        'embed'                    -0.63
        '"],"'                     -0.62
        'roll'                     -0.62
        'ocations'                 -0.60
        'mx'                       -0.59
        ' COVER'                   -0.58
        'iaz'                      -0.57
    Positive Logits
        Token (in single quotes)   Logit
        ' third'                   2.15
        ' fourth'                  2.08
        'Third'                    2.05
        'Fourth'                   1.94
        ' second'                  1.87
        'third'                    1.85
        ' fifth'                   1.84
        'fourth'                   1.82
        ' sixth'                   1.73
        ' seventh'                 1.71
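
    The two logit lists can be read as the extremes of a single per-token score. The sketch below is only illustrative: it assumes each number is this feature's effect on the corresponding token's output logit (the page does not say how the values were computed), copies the twenty listed values into a dict, and ranks them. The name `top_k` and the dict layout are hypothetical, not part of any dashboard API.

```python
# Values copied from the negative and positive logit lists on this page.
# Leading spaces inside token strings are significant (" third" != "third").
logit_effects = {
    " third": 2.15, " fourth": 2.08, "Third": 2.05, "Fourth": 1.94,
    " second": 1.87, "third": 1.85, " fifth": 1.84, "fourth": 1.82,
    " sixth": 1.73, " seventh": 1.71,
    "Ĥª": -0.68, "MET": -0.65, "ox": -0.63, "embed": -0.63, '"],"': -0.62,
    "roll": -0.62, "ocations": -0.60, "mx": -0.59, " COVER": -0.58,
    "iaz": -0.57,
}


def top_k(effects, k, largest=True):
    """Return the k (token, value) pairs with the largest or smallest values.

    Hypothetical helper for illustration; ties keep dict insertion order
    because Python's sort is stable.
    """
    return sorted(effects.items(), key=lambda kv: kv[1], reverse=largest)[:k]


if __name__ == "__main__":
    print("Most promoted tokens:")
    for token, value in top_k(logit_effects, 3):
        print(f"{token!r:>12} {value:+.2f}")

    print("Most suppressed tokens:")
    for token, value in top_k(logit_effects, 3, largest=False):
        print(f"{token!r:>12} {value:+.2f}")
```

    Ranked this way, the promoted tokens are all ordinals (with and without a leading space, in both capitalizations), which agrees with the "items listed in sequence" explanation, while the suppressed tokens show no obvious common theme.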
    Act Density 0.186%

    No Known Activations