    Explanations

    adjectives and terms that imply capability or potential

    Negative Logits
    ed        -0.27
    ing       -0.25
    arily     -0.23
    edb       -0.21
    ical      -0.20
    ically    -0.19
    ese       -0.17
    emann     -0.17
    ers       -0.17
    ede       -0.17
    Positive Logits
    /print    0.23
    able      0.20
    atable    0.20
    /edit     0.19
    /read     0.18
    -bodied   0.17
    options   0.17
    mente     0.17
    /un       0.17
    /use      0.17
    Activation Density: 0.159%

    No Known Activations