INDEX
    Explanations

    repetitive phrases or structures involving the word "of"

    Negative Logits
    " faſt"        -0.85
    " ſta"         -0.84
    " juſ"         -0.79
    " pleaſure"    -0.79
    " ſche"        -0.79
    " purpoſe"     -0.75
    " viſ"         -0.74
    " raiſ"        -0.73
    " ſte"         -0.73
    " ſtate"       -0.72
    Positive Logits
    " of"          1.88
    " Of"          1.25
    " OF"          1.25
    "Of"           1.16
    "of"           1.10
    " của"         1.08
    "ของ"          1.04
    "オブ"         0.94
    " ऑफ"          0.88
    " של"          0.87
    Activation Density: 1.574%

    No Known Activations