    NEGATIVE LOGITS
    Token               Value
    "Pyrid"             0.47
    "Celltype"          0.47
    " עצ"               0.43
    "Attrition"         0.42
    "الص"               0.41
    " agregó"           0.41
    " القدر"            0.41
    "รั่ง"               0.40
    "зульта"            0.40
    "ScienceStudent"    0.39
    POSITIVE LOGITS
    Token               Value
    " {@"               0.50
    " @"                0.47
    " HelloWorld"       0.46
    "create"            0.44
    "**/"               0.44
    " JUnit"            0.44
    " axios"            0.43
    "axios"             0.43
    " s"                0.42
    (token missing)     0.41
    Activation Density: 0.005%

    No Known Activations