    Negative Logits
     myſelf         -0.98
    wiſe            -0.96
    ity             -0.96
     injection      -0.95
     auffi          -0.93
     Efq            -0.93
    ſelves          -0.91
    ics             -0.88
     iſt            -0.88
     ―――――          -0.87
    Positive Logits
    .               0.71
    ,               0.71
     (              0.57
    :               0.56
     with           0.55
                    0.53
    بوابة           0.52
    TestingModule   0.51
     I              0.49
    <eos>           0.48
    Activation Density: 0.103%

    No Known Activations