INDEX
    Explanations

    numeric values and references to measurements or levels

    Negative Logits
      
    -1.50
    ViewFeatures
    -1.08
     ​
    -1.02
    </b>
    -1.02
    SourceChecksum
    -0.95
    -0.94
     &
    -0.92
    !—
    -0.89
       
    -0.88
     ...
    
    -0.87
    Positive Logits
     *
    0.64
    *
    0.64
     *,
    0.60
    *-
    0.59
    *[
    0.58
    *$
    0.55
    #{
    0.54
    *.
    0.53
    **/
    0.52
    *,
    0.52
    Act Density 0.024%

    No Known Activations