INDEX
    Explanations

    technical formatting and code

    Negative Logits
    "<unused544>"     0.58
    "squarePos"       0.55
    "<unused609>"     0.55
    "𝚋"               0.54
    "<unused291>"     0.52
    "<unused301>"     0.52
    "<unused657>"     0.52
    "𒋼"               0.52
    "<unused192>"     0.52
    "<unused2013>"    0.51
    Positive Logits
    " "               0.58
    " guide"          0.51
    " medical"        0.48
    " light"          0.48
    ","               0.47
    " power"          0.46
    " continuum"      0.45
    "/"               0.45
    " wilderness"     0.44
    " history"        0.44
    Activation Density: 0.000%

    No Known Activations