    Explanations

    This neuron responds to words indicating rearward or backward direction or positioning.

    Negative Logits
    /index             -0.06
     deviation         -0.06
    .cy                -0.06
     Rock              -0.06
     Ty                -0.06
     Likes             -0.06
    ありがとうござ     -0.06
     surf              -0.06
     Thing             -0.06
     Story             -0.06
    Positive Logits
     rear              0.10
     Rear              0.09
    _td                0.07
    ilinx              0.07
     fot               0.07
     geri              0.07
     rim               0.07
     elegant           0.07
     safeg             0.07
                       0.06
    Act Density 0.014%

    No Known Activations