INDEX
    Explanations

    terms related to solutions and discussions of social or political issues

    Negative Logits
    "ifo"          -0.16
    " Sachs"       -0.15
    "758"          -0.14
    " Pulse"       -0.14
    "allah"        -0.14
    ".FLAG"        -0.14
    "amat"         -0.14
    "ırak"         -0.14
    "(ListNode"    -0.13
    "æĮ¯ãĤĬ"       -0.13
    Positive Logits
    " mix"          0.65
    " equation"     0.60
    "mix"           0.45
    " Mix"          0.45
    " Equation"     0.40
    "Mix"           0.38
    " equations"    0.38
    " mixes"        0.37
    " fold"         0.36
    " MIX"          0.35
    Activation Density: 0.142%

    No Known Activations