INDEX
Explanations
occurrences of the letter 'P' in various contexts
Negative Logits
etes     -0.16
688      -0.16
ether    -0.15
Rule     -0.15
descr    -0.14
551      -0.14
ort      -0.14
soc      -0.14
ray      -0.14
eness    -0.14
Positive Logits
fer      0.28
ioni     0.24
fe       0.22
fad      0.21
fort     0.21
far      0.21
fade     0.20
fl       0.20
fa       0.20
fal      0.19
Activations Density 0.008%