INDEX
    Explanations

    instances of the word "overhe" (and variations involving "he" and "rehe")

    Negative Logits
    Token    Logit
    n        -0.25
    w        -0.20
    l        -0.19
    ss       -0.18
    y        -0.18
    la       -0.17
    sg       -0.17
    elop     -0.17
    d        -0.17
    nin      -0.17
    Positive Logits
    Token    Logit
    aring    0.25
    oric     0.24
    ated     0.24
    uristic  0.23
    ating    0.23
    arts     0.22
    ctic     0.22
    aven     0.21
    arsed    0.21
    aling    0.20
    Activation Density: 0.015%

    No Known Activations