    Explanations

    phrases related to justification or reasoning

    Negative Logits
    Token                 Logit
    "gae"                 -0.67
    "DX"                  -0.63
    "Latest"              -0.61
    " strives"            -0.60
    "*/("                 -0.59
    "Def"                 -0.59
    " prepares"           -0.58
    "inventoryQuantity"   -0.58
    " itch"               -0.57
    "iatus"               -0.56
    Positive Logits
    Token                 Logit
    "would"               1.30
    " wouldn"             1.22
    " would"              1.15
    " Would"              1.12
    " Wouldn"             1.04
    "Had"                 0.90
    " someday"            0.89
    "'d"                  0.88
    "Would"               0.86
    " hadn"               0.84
    Activation Density: 1.928%

    No Known Activations