INDEX
    Explanations

    The neuron activates on the model’s self-referential first-person statements (e.g. “I’m,” “my,” “I’m your…”).

    Negative Logits
    Token            Logit
    ".geom"          -0.07
    " chai"          -0.06
    " GMO"           -0.06
    "atıcı"          -0.06
    "\tstrcpy"       -0.06
    " conclusive"    -0.06
    " Dj"            -0.06
    "_minimum"       -0.06
    "Projectile"     -0.06
    "Delayed"        -0.06
    Positive Logits
    Token            Logit
    " feder"         0.08
    "weg"            0.07
    "SenderId"       0.06
    " strerror"      0.06
    "тот"            0.06
    " clinically"    0.06
    ".distance"      0.06
    " redesigned"    0.06
    "greater"        0.06
    " Кам"           0.06
    Activation Density: 0.011%

    No Known Activations