INDEX
    Explanations

    specific mathematical notation and structure in equations

    Negative Logits
     Matth      -0.69
     McC        -0.67
     Wall       -0.62
     sco        -0.59
    вест        -0.58
    θρω         -0.58
    ine         -0.57
     Helms      -0.57
     Cannon     -0.56
    locals      -0.56
    Positive Logits
    )]          1.73
    )}          1.42
    ")]         1.13
    ))]         1.08
    )]$         1.08
    )</         1.08
    )]↵         1.04
     )}         1.03
    )}\         1.01
    )]:         1.00
    Act Density 0.328%

    No Known Activations