INDEX
    Explanations

    numbers related to 2

    Negative Logits
    " Heng"        -0.08
    " Hopper"      -0.08
    " Pitt"        -0.08
    " apare"       -0.08
    "HDR"          -0.08
    " Stav"        -0.08
    " Lynch"       -0.07
    " Harrison"    -0.07
    " anaer"       -0.07
    " HDR"         -0.07

    Positive Logits
    "(Bit"          0.09
    "udo"           0.08
    "_analysis"     0.08
    "Sleeping"      0.08
    " thy"          0.07
    "(struct"       0.07
    "Compress"      0.07
    "pal"           0.07
    " raconte"      0.07
    "-proof"        0.07

    Act Density 0.004%

    No Known Activations