INDEX
    Explanations

    This neuron activates on occurrences of the word “joint” (or “joints”).

    Negative Logits
    .bitmap     -0.07
    ">$         -0.06
     Burke      -0.06
    ucle        -0.06
     English    -0.06
     Netflix    -0.06
    Bitmap      -0.06
    Buttons     -0.06
     sizes      -0.06
    Walk        -0.06
    Positive Logits
     joints     0.12
     joint      0.12
     Joint      0.11
    joint       0.10
    _joint      0.10
    аст         0.08
                0.07
    Joint       0.07
    oje         0.07
     JOHN       0.07
    Activation Density: 0.004%

    No Known Activations