INDEX
    Explanations

    occurrences of the letter 'P' in various contexts

    Negative Logits
    etes     -0.16
    688      -0.16
    ether    -0.15
     Rule    -0.15
    descr    -0.14
    551      -0.14
    ort      -0.14
    soc      -0.14
    ray      -0.14
    eness    -0.14
    Positive Logits
    fer      0.28
    ioni     0.24
    fe       0.22
    fad      0.21
    fort     0.21
    far      0.21
    fade     0.20
    fl       0.20
    fa       0.20
    fal      0.19
    Act Density 0.008%

    No Known Activations