    NEGATIVE LOGITS
    ">';↵         -0.07
    voř           -0.07
    ưở            -0.07
     مدير         -0.07
    уру           -0.07
    (blank)       -0.06
     fifo         -0.06
    που           -0.06
    _PAIR         -0.06
     지도          -0.06
    POSITIVE LOGITS
     ox           0.18
     Ox           0.15
    ox            0.13
     Cox          0.12
    cox           0.09
     oxy          0.08
     Vox          0.08
    =========     0.08
     oxidative    0.07
     oxidation    0.07
    Activation Density: 0.006%

    No Known Activations