INDEX
    Explanations
    Negative Logits
    -0.07
    	               
    -0.07
    _SCREEN
    -0.07
     phổ
    -0.07
    =cv
    -0.07
     آقای
    -0.07
    emme
    -0.06
    、彼
    -0.06
     vacation
    -0.06
     görüş
    -0.06
    Positive Logits
     NAND
    0.07
     nil
    0.06
     valueForKey
    0.06
     Nord
    0.06
    KV
    0.06
    Thor
    0.06
    Attend
    0.06
     wavelength
    0.06
     winger
    0.06
    こと
    0.05
    Act Density 0.001%

    No Known Activations