    Explanations

    adverbs that describe manner or frequency of actions

    Negative Logits
    Token    Logit
    al       -0.81
    p        -0.80
    l        -0.77
    k        -0.76
    b        -0.74
    d        -0.73
    r        -0.73
    es       -0.72
    an       -0.72
    z        -0.72
    Positive Logits
    Token        Logit
    sively       1.54
    ently        1.50
    denly        1.46
    ificantly    1.45
    ALLY         1.43
    xically      1.42
    atically     1.40
    aneously     1.40
    cerely       1.38
    tically      1.37
    Activation Density: 0.661%

    No Known Activations