    Explanations

    references to numerical values or calculations

    Negative Logits
    Token            Logit
    " Roskov"        -1.49
    " myſelf"        -1.44
    " Efq"           -1.40
    "ſelf"           -1.33
    " itſelf"        -1.33
    "LookAnd"        -1.26
    " ―――――"         -1.26
    " raiſ"          -1.24
    " Jefus"         -1.23
    " themſelves"    -1.21
    Positive Logits
    Token            Logit
    "<eos>"          0.70
    "↵↵"             0.68
    (blank)          0.66
    "."              0.62
    " ("             0.61
    "I"              0.60
    "<strong>"       0.60
    "x"              0.60
    " ."             0.59
    " as"            0.58
    Act Density 0.314%

    No Known Activations