    Explanations

    I'm sorry, but it seems there was an issue with the text for Neuron 4. Would you be able to provide the correct text for Neuron 4 activations so that I can analyze it for you?

    Negative Logits
    Token                 Logit
    "rules"               -0.66
    " Allied"             -0.60
    " Clair"              -0.59
    " branches"           -0.58
    " electromagnetic"    -0.58
    " PTS"                -0.57
    " tides"              -0.56
    " Malone"             -0.56
    " uniform"            -0.56
    " excess"             -0.55
    Positive Logits
    Token                 Logit
    "'m"                  1.40
    "'ve"                 1.26
    "stanbul"             1.24
    "nex"                 1.23
    "EEE"                 1.19
    "zzy"                 1.11
    " suppose"            1.05
    "'ll"                 1.03
    "ANS"                 1.02
    "ronic"               1.01
    Act Density 0.206%

    No Known Activations