INDEX
    Explanations
    Negative Logits
    Token          Logit
    " Vir"         -0.07
    "לות"          -0.07
    ".her"         -0.07
    " taps"        -0.07
    (not shown)    -0.07
    "Streams"      -0.07
    " vita"        -0.07
    " trials"      -0.07
    (not shown)    -0.06
    ".adj"         -0.06
    Positive Logits
    Token          Logit
    " homic"       0.07
    " adec"        0.07
    " STA"         0.07
    "eware"        0.07
    "كني"          0.07
    "reeNode"      0.06
    "ebin"         0.06
    "cente"        0.06
    "ết"           0.06
    "目的在于"      0.06
    Activation Density: 0.006%

    No Known Activations