    NEGATIVE LOGITS
    ="'.$           0.38
                    0.38
     крайне         0.37
     κάτι           0.36
    ऊंगी            0.36
     horrend        0.35
     rouges         0.35
    互联网档案馆    0.35
     leukocytes     0.35
     piglets        0.35
    POSITIVE LOGITS
    //      1.45
     //     1.26
    <!--    0.95
    //      0.90
     /*     0.84
     <!--   0.83
    /**     0.80
    +//     0.79
    #       0.79
    ///     0.79
    Act Density 0.538%

    No Known Activations