INDEX
    Negative Logits
    Token               Logit
    " Efq"              -1.00
    " myſelf"           -0.91
    " MainAxisSize"     -0.90
    " unknownFields"    -0.84
    " kaynağından"      -0.84
    " Theſe"            -0.82
    " Monfieur"         -0.81
    " Anſ"              -0.80
    " whoſe"            -0.78
    "kranz"             -0.77
    Positive Logits
    Token               Logit
    "e"                 0.63
    "E"                 0.53
    " e"                0.50
    "the"               0.50
    "<bos>"             0.50
    " principal"        0.48
    "↵↵"                0.48
    "D"                 0.48
    " E"                0.47
    " main"             0.47
    Activation Density: 0.097%

    No Known Activations