    Negative Logits
    " maka"        -0.08
    "badge"        -0.08
    " erforder"    -0.08
    " picker"      -0.08
    " atributo"    -0.08
    "ومن"          -0.08
    "هيز"          -0.07
    " виж"         -0.07
    " catered"     -0.07
    "看来"         -0.07
    Positive Logits
    " regulators"        0.09
    (token not shown)    0.08
    " regulation"        0.08
    " layers"            0.08
    " repress"           0.08
    "াধীন"               0.08
    " regulating"        0.08
    (token not shown)    0.08
    " Determ"            0.08
    " interplay"         0.07
    Activation Density: 0.004%

    No Known Activations