    Explanations

    instances of code and programming syntax

    Negative Logits
    " for"     -0.26
    " first"   -0.20
    " to"      -0.20
    " not"     -0.20
    " with"    -0.20
    " on"      -0.20
    " and"     -0.19
    " in"      -0.19
    " if"      -0.18
    " do"      -0.18
    Positive Logits
    "\t"        0.32
    ")↵↵"       0.24
                0.21
    "\tv"       0.18
    ".core"     0.18
    "_core"     0.18
    "-Core"     0.18
    "-core"     0.17
    "\tg"       0.17
    "\tu"       0.17
    Activation Density: 0.004%

    No Known Activations