INDEX
    Explanations

    physical actions or movements

    NEGATIVE LOGITS
    Token           Logit
    "shire"         -0.63
    " distances"    -0.61
    " attribution"  -0.59
    " Deal"         -0.59
    " details"      -0.57
    " falsehood"    -0.57
    " mirrors"      -0.57
    " EDITION"      -0.57
    " Ended"        -0.55
    " Belfast"      -0.54
    POSITIVE LOGITS
    Token           Logit
    "formance"      1.34
    "cking"         1.32
    "gging"         1.28
    "eping"         1.27
    "eking"         1.26
    "eps"           1.24
    "pperc"         1.19
    "eling"         1.17
    "ppy"           1.14
    "aking"         1.12
    Activation Density: 0.020%

    No Known Activations