    Negative Logits
    Token             Logit
    " snippetHide"    -0.91
    " démocr"         -0.85
    "GraphicsUnit"    -0.85
    " auroit"         -0.85
    " feroit"         -0.83
    " Efq"            -0.82
    " purpoſe"        -0.82
    " vérit"          -0.82
    " pouvoit"        -0.80
    " enfans"         -0.77
    Positive Logits
    Token    Logit
    "y"      0.60
    ","      0.59
    "a"      0.57
    " ("     0.56
    "s"      0.54
    "e"      0.53
    "i"      0.50
    "."      0.48
    "er"     0.48
    "aid"    0.48
    Activation Density: 0.039%

    No Known Activations