    Explanations

    segments related to measurements or quantities

    Negative Logits
    mybatisplus        -0.89
    DOCTYPE            -0.85
    InjectAttribute    -0.84
    Portály            -0.83
    NUMX               -0.82
    argout             -0.81
    rachtet            -0.79
     وتسجيلات          -0.78
    '},                -0.78
    σθαι               -0.78
    Positive Logits
    ↵↵↵                0.76
    ↵↵                 0.75
    [toxicity=0]       0.69
    (whitespace)       0.68
    ↵↵↵↵               0.68
    (missing)          0.67
    hline              0.63
    ↵↵↵↵↵              0.62
    "                  0.59
    (whitespace)       0.57
    Activation Density: 0.110%

    No Known Activations