    Explanations

    This neuron flags words that appear as quoted titles or names, that is, text inside quotation marks.

    Negative Logits
    confirmation    -0.07
    incip           -0.06
    /type           -0.06
     текущ          -0.06
    belum           -0.06
    animation       -0.06
    Girls           -0.06
    سان             -0.06
    -animation      -0.06
    olik            -0.06
    Positive Logits
     пу             0.08
    .Merge          0.06
                    0.06
    )();↵           0.06
    hill            0.06
     кім            0.06
    ={[             0.06
     compensate     0.06
     аром           0.06
    (delete         0.06
    Act Density 0.109%

    No Known Activations