INDEX
    Explanations

    references to directional movement and changes in position or orientation

    Negative Logits
    '">—'             -0.38
    'saraba'          -0.35
    'WriteBarrier'    -0.33
    'imédia'          -0.32
    'rimônio'         -0.31
    'inWeight'        -0.31
    'Gemeinden'       -0.30
    'rlrl'            -0.28
    'utilisons'       -0.28
    'cookieParser'    -0.28
    Positive Logits
    ' direction'      3.70
    ' Direction'      3.16
    'direction'       3.16
    ' directions'     3.08
    ' DIRECTION'      3.02
    'Direction'       2.92
    '方向'            2.64
    'DIRECTION'       2.61
    'directions'      2.59
    ' Directions'     2.58
    Activation Density: 1.270%

    No Known Activations