INDEX
    Explanations

    words related to increasing or driving up various factors or actions

    Negative Logits
    "Guard"        -0.69
    "eal"          -0.66
    "ftime"        -0.66
    "inis"         -0.64
    "bis"          -0.63
    "sm"           -0.62
    "til"          -0.62
    "idelines"     -0.61
    "ynski"        -0.60
    "cean"         -0.59
    Positive Logits
    " aside"        1.06
    " away"         0.98
    " forth"        0.98
    " down"         0.96
    " wedge"        0.91
    " upwards"      0.90
    " onward"       0.90
    " upward"       0.89
    " up"           0.84
    " apart"        0.83
    Activation Density: 0.144%

    No Known Activations