INDEX
    Explanations

    phrases indicating future actions or developments

    would lead to future actions

    Negative Logits
    -0.44
    thodox
    -0.44
    saraba
    -0.44
     Tang
    -0.43
     Marginal
    -0.43
     Pits
    -0.42
    TabStop
    -0.42
     Complex
    -0.42
     Toll
    -0.42
    ContextHolder
    -0.41
    Positive Logits
    AndEndTag
    0.57
    writeFieldEnd
    0.56
     ""],
    0.47
    Diwedd
    0.46
     plomo
    0.42
     amizade
    0.41
    setof
    0.40
     später
    0.40
    Citiți
    0.38
    żesz
    0.38
    Act Density 0.391%

    No Known Activations