    Negative Logits
    ضوع
    -0.07
     Homer
    -0.07
    hoo
    -0.06
    Ada
    -0.06
     Cathedral
    -0.06
    cta
    -0.06
     Gotham
    -0.06
     tome
    -0.06
    >A
    -0.06
     Karn
    -0.06
    Positive Logits
     while
    0.12
     While
    0.10
    While
    0.10
    	while
    0.09
     wipes
    0.08
     mientras
    0.08
    while
    0.08
     whilst
    0.08
     wipe
    0.07
    "While
    0.07
    Activation Density 0.055%

    No Known Activations