INDEX
    Explanations

    phrases related to returning or reflecting on past events

    Negative Logits
    orthodox    -0.77
    imon        -0.71
    icons       -0.67
    shown       -0.67
    ruff        -0.67
    olor        -0.66
    stant       -0.65
    orst        -0.65
    ording      -0.65
    tops        -0.64
    Positive Logits
     undone     1.11
     forth      0.96
     ashore     0.92
     hither     0.88
     into       0.81
     apart      0.81
     roaring    0.81
     closer     0.80
    leon        0.78
     out        0.77
    Activation Density: 1.061%

    No Known Activations