INDEX
    Explanations

    Science and technology

    NEGATIVE LOGITS
     itſelf      -0.92
     houſe       -0.90
     Cæsar       -0.88
     Efq         -0.85
     Houſe       -0.82
     dentaire    -0.82
     ſta         -0.82
     pleaſure    -0.80
     purpoſe     -0.77
     giapp       -0.77
    POSITIVE LOGITS
    sen          0.60
    entile       0.60
    ting         0.55
    phalt        0.53
    ness         0.52
    sing         0.52
    ging         0.51
    ture         0.51
    ses          0.48
    ders         0.47
    Activation Density: 0.270%

    No Known Activations