    Explanations

    Written text

    The neuron activates on words and phrases that signal causal relationships (e.g., “caused,” “due to,” “causes”).

    Negative Logits
    	queue         -0.07
    .Time         -0.06
    ynchron       -0.06
    .”↵           -0.06
    .More         -0.06
    BILE          -0.06
    たら            -0.06
    /G            -0.06
     studio       -0.06
     recurrence   -0.06

    Positive Logits
    ливі          0.06
    toMatch       0.06
    ogr           0.06
    Particles     0.06
     Morrow       0.06
    iale          0.06
    lm            0.06
     Predator     0.06
     название     0.06
    aydı          0.06

    Act Density 0.039%

    No Known Activations