INDEX
    Explanations

    occurrences of the letter 'n' in various contexts

    Negative Logits
     Ne     -0.56
     Ni     -0.54
     Ни     -0.50
     NE     -0.49
     Nie    -0.49
     Ν      -0.48
     NI     -0.47
     Nik    -0.46
     Nag    -0.46
     ne     -0.46
    Positive Logits
    n       1.52
    nan     1.44
    nin     1.39
    ned     1.38
    nn      1.38
    non     1.36
    nas     1.34
    nat     1.31
    nes     1.30
    nu      1.30
    Act Density: 0.672%

    No Known Activations