INDEX
    Explanations

    subjects performing actions

    Negative Logits
    "上述"          0.55
    "Effective"     0.55
    "effective"     0.50
    "Percent"       0.50
    "Overlap"       0.48
    " pernyataan"   0.47
    " suelen"       0.46
    "必ず"          0.46
    "際に"          0.46
    "Typically"     0.46
    Positive Logits
    " began"        1.08
    " went"         1.02
    " became"       0.99
    " knew"         0.97
    " laughed"      0.93
    " took"         0.92
    " awoke"        0.85
    " panicked"     0.85
    " lasted"       0.85
    " hurriedly"    0.85
    Activation Density 0.106%

    No Known Activations