INDEX
    Explanations

    synthesizers

    Negative Logits
    Token         Logit
    ".output"     -0.08
    "_y"          -0.07
    "(plane"      -0.07
    (blank)       -0.07
    " ya"         -0.07
    " Hoàng"      -0.07
    " pun"        -0.07
    " автомоб"    -0.06
    " paw"        -0.06
    " Approval"   -0.06
    Positive Logits
    Token          Logit
    (whitespace)   0.06
    "_User"        0.06
    "lexible"      0.06
    " terrorism"   0.06
    " adds"        0.06
    "-dist"        0.06
    "nbsp"         0.06
    " synth"       0.05
    "otu"          0.05
    "(dummy"       0.05
    Activation Density: 0.005%

    No Known Activations