    Explanations

    punctuation marks and formatting cues in the text

    Negative Logits
    Token              Logit
    (blank)            -0.64
    " O"               -0.58
    "-"                -0.56
    " sk"              -0.54
    " z"               -0.54
    "tra"              -0.52
    " mor"             -0.51
    " Sch"             -0.51
    " I"               -0.51
    " ST"              -0.50
    Positive Logits
    Token              Logit
    " pleaſure"        1.31
    "GEBURTSDATUM"     1.19
    " purpoſe"         1.14
    " houſe"           1.14
    " myſelf"          1.13
    "ſelf"             1.12
    " anſ"             1.10
    " raiſ"            1.10
    " diſt"            1.10
    "UnsafeEnabled"    1.08
    Act Density 0.301%

    No Known Activations