Explanations
the letter 'r' followed by another specific letter sequence
instances of the letter 'r'
Negative Logits
terson     -0.74
GOODMAN    -0.74
Isles      -0.72
Fargo      -0.68
compe      -0.67
Lobby      -0.67
Nieto      -0.65
Spartan    -0.65
Sons       -0.65
renheit    -0.64
Positive Logits
umbling    1.23
ipples     1.21
umbled     1.18
acking     1.17
ascal      1.16
angers     1.16
ipp        1.15
attle      1.15
iddles     1.15
ussia      1.14
Activations Density: 0.022%