    NEGATIVE LOGITS
    Token    Logit
    "."      -1.05
    " to"    -0.98
    ","      -0.90
    "-"      -0.87
    " and"   -0.81
    "↵↵"     -0.77
    " that"  -0.73
    " ("     -0.71
    "to"     -0.65
    " in"    -0.65
    POSITIVE LOGITS
    Token          Logit
    " myſelf"      1.59
    " itſelf"      1.58
    " ―――――"       1.53
    " Efq"         1.47
    " Theſe"       1.42
    " themſelves"  1.36
    " Jefus"       1.36
    " Monfieur"    1.35
    " iſt"         1.33
    " himſelf"     1.33
    Activation Density: 0.162%

    No Known Activations