INDEX
    Explanations
    Negative Logits
    Token        Logit
    (blank)      -0.09
    "PTS"        -0.08
    "Со"         -0.08
    "명을"       -0.07
    " 것을"      -0.07
    " fost"      -0.07
    "Eti"        -0.07
    " prip"      -0.07
    "kti"        -0.07
    (blank)      -0.07
    Positive Logits
    Token            Logit
    "/etc"           0.14
    " ஆகிய"          0.08
    " Sieg"          0.08
    " amante"        0.07
    " accountant"    0.07
    "/_"             0.07
    " crow"          0.07
    " Alexand"       0.07
    ",etc"           0.07
    " souff"         0.07
    Activation Density: 0.168%

    No Known Activations