    NEGATIVE LOGITS
    Token           Logit
    " ―――――"        -2.05
    " ་་"           -1.99
    " myſelf"       -1.95
    " Jefus"        -1.94
    " Efq"          -1.87
    " doubtnut"     -1.84
    " Houſe"        -1.80
    " Monfieur"     -1.79
    " ſind"         -1.79
    " Anſ"          -1.77
    POSITIVE LOGITS
    Token           Logit
    " the"           1.18
    "."              1.05
    "<eos>"          1.05
    " ("             1.02
    " ."             0.90
    "↵↵"             0.89
    (blank)          0.89
    " a"             0.88
    " in"            0.88
    (whitespace)     0.87
    Activation Density: 1.206%

    No Known Activations