    Explanations

    scientific and technical texts

    Negative Logits
    Token       Logit
    )";\n       -0.83
    })*/        -0.78
     itſelf     -0.77
    "],\n       -0.75
    .)}         -0.74
    "]];        -0.73
    ſelves      -0.72
    '},\n       -0.71
    "},\n       -0.71
    ſelf        -0.71
    Positive Logits
    Token       Logit
    i           0.98
    ly          0.71
    ed          0.71
    y           0.66
    ec          0.66
    ii          0.65
    ige         0.64
    es          0.63
    iu          0.63
    a           0.61
    Activation Density: 0.942%

    No Known Activations