INDEX
    Explanations

    abbreviated scientific terminology and acronyms

    Negative Logits
    "the"          -0.65
    "theore"       -0.63
    "۔"            -0.61
    "tradition"    -0.60
    "tafel"        -0.53
    " raiſ"        -0.53
    " myſelf"      -0.53
    "they"         -0.53
    " deſt"        -0.52
    "ization"      -0.52
    Positive Logits
    "sies"         0.45
    "hes"          0.44
    "ses"          0.44
    "ais"          0.43
    "rs"           0.43
    "r"            0.42
    "dos"          0.42
    "tis"          0.41
    "es"           0.41
    "tic"          0.41
    Activation Density: 1.247%

    No Known Activations