    Explanations

    instances of the verb "be" in various forms

    Negative Logits
    Token         Logit
    " being"      -0.35
    " Being"      -0.34
    "Being"       -0.33
    "being"       -0.33
    "-being"      -0.29
    "被"          -0.27
    " 被"         -0.24
    " already"    -0.24
    "now"         -0.24
    "never"       -0.24
    Positive Logits
    Token         Logit
    "friend"      0.36
    " able"       0.33
    "COME"        0.32
    "fall"        0.31
    "get"         0.28
    "fit"         0.27
    "have"        0.26
    "que"         0.26
    "eline"       0.24
    "stride"      0.22
    Act Density 0.293%

    No Known Activations