INDEX
    Explanations

    references to physical actions and spatial descriptions

    Negative Logits
    "ise"          -0.15
    "mil"          -0.15
    " Fol"         -0.14
    " Dow"         -0.14
    " downstream"  -0.14
    "["            -0.14
    "ï"            -0.14
    " punct"       -0.14
    "ersh"         -0.14
    "ko"           -0.13
    Positive Logits
    " reaching"    0.18
    "ä¸Ī"          0.17
    "jedn"         0.16
    ".extent"      0.16
    "Reach"        0.16
    "_touch"       0.15
    "asto"         0.15
    "æº"           0.15
    "esine"        0.14
    " chiá»ģu"     0.14
    Act Density 0.069%

    No Known Activations