    Explanations

    quotation marks and string delimiters in code or text

    Negative Logits
    Token     Logit
    '));      -0.98
    ]');      -0.87
    )";       -0.86
    %</       -0.82
    .";       -0.81
    `,        -0.79
    '):       -0.79
    }');      -0.78
    )");      -0.78
    '],       -0.77
    Positive Logits
    Token     Logit
    ("        1.49
    ]=="      1.43
    :@"       1.35
    ="        1.32
    !("       1.30
     "        1.27
    "=>"      1.21
              1.18
    ]["       1.17
    _("       1.16
    Activation Density 0.283%

    No Known Activations