INDEX
    Explanations
    Negative Logits
    Token              Value
    " lack"            0.63
    " versatility"     0.57
    " technological"   0.57
    " overlap"         0.56
    " slogan"          0.55
    " ubiqu"           0.55
    " monopol"         0.52
    " fanc"            0.52
    " underpin"        0.51
    " crew"            0.51
    POSITIVE LOGITS
    Token              Value
    "Diabetes"         0.59
    " пациен"          0.59
    " и"               0.57
    "You"              0.56
    "ضي"               0.55
    (blank in source)  0.55
    "Transforms"       0.54
    (blank in source)  0.54
    " পদক্ষেপ"          0.53
    "Procedure"        0.53
    Activation Density: 0.001%

    No Known Activations