    Explanations

    This neuron identifies Wikipedia category labels that denote an athlete’s medal achievements (e.g. “Olympic silver medalists for X”).

    Negative Logits
    "igne"       -0.07
    "чим"        -0.06
    "专业"       -0.06
    ".serial"    -0.06
    "itol"       -0.06
    "φό"         -0.06
    "ськими"     -0.06
    "cano"       -0.06
    "Mary"       -0.06
    " Ronald"    -0.06
    Positive Logits
    " avere"       0.07
    " delete"      0.06
    " envis"       0.06
    "ORIZATION"    0.06
    "\tfor"        0.06
    " Sugar"       0.06
    " MOTOR"       0.06
    " AREA"        0.06
    " apply"       0.06
    " hỗ"          0.06
    Act Density 0.001%

    No Known Activations