INDEX
    Explanations
    Negative Logits
     juris            -0.07
     Ridge            -0.07
    uur               -0.06
     signings         -0.06
     nová             -0.06
     Lisa             -0.06
    (not rendered)    -0.06
     isteyen          -0.06
     technik          -0.06
     ولد              -0.06
    Positive Logits
    (not rendered)    0.07
    cle               0.06
    (not rendered)    0.06
    (not rendered)    0.06
    (not rendered)    0.06
    .Protocol         0.06
     Xuân             0.06
     Polo             0.06
     ach              0.06
    ông               0.06
    Activation Density: 0.004%

    No Known Activations