INDEX
    Explanations

    places or destinations

    instances of the word "to"

    Negative Logits
    "selves"         -0.91
    "etheless"       -0.77
    "terday"         -0.76
    "fortunately"    -0.71
    "angered"        -0.68
    "enance"         -0.67
    "worthiness"     -0.65
    "manship"        -0.65
    "ection"         -0.63
    " entit"         -0.62
    Positive Logits
    "pless"          1.08
    " extremes"      1.07
    " jail"          1.04
    " bed"           1.03
    " sleep"         1.00
    " lengths"       0.93
    " prison"        0.85
    " war"           0.84
    " bat"           0.84
    " hell"          0.81
    Activation Density: 0.077%

    No Known Activations