    Explanations

    locations or directional words (e.g., "across," "from") in a sentence

    the word "across" in various contexts

    Negative Logits
    "nery"       -0.76
    "etic"       -0.70
    "FORE"       -0.69
    "spot"       -0.63
    "rw"         -0.61
    "ENC"        -0.58
    "nce"        -0.57
    "Parents"    -0.57
    "getic"      -0.57
    " HELP"      -0.57
    Positive Logits
    "roads"      0.83
    "atform"     0.77
    " rooft"     0.74
    "side"       0.72
    "hang"       0.71
    "flow"       0.70
    "ĸļ"         0.67
    "urst"       0.67
    " paths"     0.66
    "halla"      0.66
    Activation Density: 0.027%

    No Known Activations