INDEX
    Explanations
    Negative Logits
    " نج"  0.46
    " פ"  0.44
    (blank)  0.42
    " అవ"  0.41
    " וח"  0.41
    " prins"  0.40
    " беспо"  0.40
    (blank)  0.39
    "וני"  0.39
    " прямоуго"  0.39
    Positive Logits
    " deuterium"  0.48
    "ر"  0.46
    "räger"  0.45
    "äure"  0.45
    " Nürnberg"  0.44
    " Pinnacle"  0.44
    "PYTHON"  0.44
    " Pickett"  0.42
    "subfigure"  0.42
    "epub"  0.42
    Activation Density: 0.001%

    No Known Activations