    Explanations

    phrases indicating a consistent or habitual action

    Negative Logits
     often
    -0.26
     altogether
    -0.23
    often
    -0.22
     artık
    -0.21
     oft
    -0.20
     hardly
    -0.19
     seldom
    -0.19
    并不
    -0.19
     neither
    -0.19
     actually
    -0.19
    Positive Logits
     been
    0.22
    cky
    0.21
    -on
    0.19
    -available
    0.19
    -ending
    0.18
    -present
    0.17
    以来
    0.17
     Been
    0.17
     wondered
    0.17
    ready
    0.17
    Act Density 0.093%

    No Known Activations