INDEX
    Explanations

    Pronouns (he, she) or names, followed by an action or feeling

    Negative Logits
    Token            Value
    " Apparently"    0.93
    "apparently"     0.93
    " apparently"    0.89
    " seeming"       0.89
    "Apparently"     0.87
    " apparent"      0.86
    " meille"        0.79
    "我们也"         0.77
    "居然"           0.76
    " craziness"     0.76
    Positive Logits
    Token            Value
    " knew"          1.20
    " feel"          1.19
    " feels"         1.18
    " watched"       1.14
    " knows"         1.13
    " জানে"          1.11
    "feel"           1.07
    " imagines"      1.05
    "感觉到"         1.02
    " siente"        1.01
    Activation Density: 0.101%

    No Known Activations