INDEX
    Explanations

    defining python functions

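    Negative Logits
    (␣ marks a leading space, ↵ a newline inside the token)
    Token        Logit
    ␣“[          -1.14
    (missing)    -1.05
    ␣[]↵         -1.00
    ␣'"+         -1.00
    ְּ            -0.96
    ␣'-')        -0.96
    ␣befindet    -0.95
    .$           -0.94
    “[           -0.94
    ,@           -0.93

    Positive Logits
    (␣ marks a leading space, ↵ a newline inside the token)
    Token        Logit
    ():          3.13
    ():↵         2.88
    ␣():         2.05
    ␣(           2.03
    ):↵          1.97
    '):          1.94
    ):           1.91
    ()):         1.88
    ␣):          1.75
    (            1.71
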
    Act Density 0.019%

    No Known Activations