    Explanations

    phrases related to planning and future actions

    Negative Logits
    Token          Logit
    "šov"          -0.17
    "pv"           -0.16
    "eniz"         -0.15
    "utherland"    -0.14
    "ussia"        -0.14
    "dn"           -0.14
    "udes"         -0.14
    "agem"         -0.14
    "ńst"          -0.14
    "rega"         -0.14
    Positive Logits
    Token          Logit
    " next"        0.73
    "next"         0.60
    " future"      0.53
    "_next"        0.52
    "(next"        0.51
    ".next"        0.50
    "\tnext"       0.49
    "-next"        0.48
    " Next"        0.47
    " näch"        0.47
    Act Density 0.390%
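
The dashboard does not say how the density figure is computed. As a minimal sketch, assuming "Act Density" means the fraction of token positions at which the feature's activation is above zero, it could be calculated as below. The function name `activation_density`, the `threshold` parameter, and the sample activations are all hypothetical and are not taken from the dashboard.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of positions whose activation exceeds `threshold`.

    Assumption: density = (number of active positions) / (total positions).
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Made-up activations: 39 active positions out of 10,000 (illustration only).
sample = [0.0] * 9961 + [0.5] * 39
density = activation_density(sample)
print(f"Act Density {density:.3%}")
```

A density this low would mean the feature fires on only a small share of tokens, which fits a feature tied to a narrow family of tokens such as the "next" variants listed above.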

    No Known Activations