INDEX
    Explanations
    Negative Logits
    "hammer"          -0.07
    " Matter"         -0.06
    " ee"             -0.06
    " heritage"       -0.06
    " Desert"         -0.06
    " contributor"    -0.06
    " scanned"        -0.06
    "serial"          -0.06
    " Collector"      -0.06
    "qua"             -0.06
    Positive Logits
    " недостат"       0.07
    "いい"            0.06
    "plet"            0.06
    "로운"            0.06
    " chopped"        0.06
    " восстанов"      0.06
    " باشگاه"         0.06
    "        ↵        ↵"    0.06
    " polarity"       0.06
    "يمكن"            0.06
    Activation Density: 0.327%

    No Known Activations