INDEX
    Explanations
    Negative Logits
     machine                 -0.07
     machines                -0.07
    Credit                   -0.07
    inite                    -0.07
    (bucket                  -0.06
     machen                  -0.06
     jet                     -0.06
    Card                     -0.06
     Schmidt                 -0.06
    (whitespace-only token)  -0.06
    Positive Logits
     audio                   0.06
     MouseButton             0.06
     лицо                    0.06
    (token missing)          0.06
    _ALLOWED                 0.06
     slaughtered             0.06
    公共                     0.06
    以为                     0.06
    ณะ                       0.06
     applause                0.06
    Act Density 0.005%

    No Known Activations