INDEX
    Explanations

    technical documents

    Negative Logits
    Token           Logit
                    -0.56
     for            -0.55
     -              -0.53
     (              -0.53
     I              -0.52
     in             -0.50
     len            -0.50
     bien           -0.48
    <eos>           -0.48
     en             -0.47
    Positive Logits
    Token           Logit
     Anſ            1.43
     iſt            1.41
     Theſe          1.38
     Monfieur       1.36
     Reſ            1.35
     Jefus          1.31
     whoſe          1.31
     Majefty        1.30
     Diſ            1.28
     purpoſe        1.28
    Activation Density: 0.224%

    No Known Activations