INDEX
    Explanations

    the introduction of significant concepts or topics within the text

    Negative Logits
     EconPapers      -1.29
     bezeichneter    -1.21
     Efq             -1.17
     myſelf          -1.16
     itſelf          -1.15
     pleaſure        -1.14
     raiſ            -1.13
     purpoſe         -1.12
    abestanden       -1.11
     ―――――           -1.06
    Positive Logits
    (token not shown)   0.70
    ↵↵                  0.69
    (token not shown)   0.62
     "                  0.60
    c                   0.57
    1                   0.57
    ↵↵↵                 0.56
     ...                0.56
    </em>               0.55
    "                   0.55
    Activation Density: 0.008%

    No Known Activations