INDEX
Explanations
occurrences of the letter 'n' in various contexts
Negative Logits
Token   Logit
Ne      -0.56
Ni      -0.54
Ни      -0.50
NE      -0.49
Nie     -0.49
Ν       -0.48
NI      -0.47
Nik     -0.46
Nag     -0.46
ne      -0.46
Positive Logits
Token   Logit
n       1.52
nan     1.44
nin     1.39
ned     1.38
nn      1.38
non     1.36
nas     1.34
nat     1.31
nes     1.30
nu      1.30
Activations Density: 0.672%
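
The explanation above can be checked against the listed logits with a short script. This is only a sketch: the token strings and values are copied from the lists on this page, any leading whitespace in the tokens may have been lost in extraction, and the helper name `first_char_case` is hypothetical rather than part of any tool.

```python
# Top logits as listed on this page (token -> logit).
# Assumption: tokens are reproduced without any leading whitespace.
negative = {
    "Ne": -0.56, "Ni": -0.54, "Ни": -0.50, "NE": -0.49, "Nie": -0.49,
    "Ν": -0.48, "NI": -0.47, "Nik": -0.46, "Nag": -0.46, "ne": -0.46,
}
positive = {
    "n": 1.52, "nan": 1.44, "nin": 1.39, "ned": 1.38, "nn": 1.38,
    "non": 1.36, "nas": 1.34, "nat": 1.31, "nes": 1.30, "nu": 1.30,
}


def first_char_case(token: str) -> str:
    """Classify a token by the case of its first character (hypothetical helper)."""
    first = token[:1]
    if first.isupper():
        return "upper"
    if first.islower():
        return "lower"
    return "other"


# Group each list by the case of the first character.
pos_lower = [t for t in positive if first_char_case(t) == "lower"]
neg_upper = [t for t in negative if first_char_case(t) == "upper"]
neg_other = [t for t in negative if first_char_case(t) != "upper"]

print("positive tokens starting lowercase:", pos_lower)
print("negative tokens starting uppercase:", neg_upper)
print("negative tokens not starting uppercase:", neg_other)
```

On the values listed, every positive token begins with a lowercase letter, and nine of the ten negative tokens begin with an uppercase one (including the Cyrillic and Greek entries). The exception is `ne`, which suggests the feature may be sensitive to case or token position rather than to the letter 'n' alone.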