Explanations
occurrences of the name "Alan" in various contexts
Alan and alanine
Negative Logits
Token           Logit
weight          -0.45
these           -0.44
they            -0.42
These           -0.38
kautta          -0.35
These           -0.33
sign            -0.33
we              -0.32
respectively    -0.32
дир             -0.32
POSITIVE LOGITS
Token           Logit
Alan            2.05
Alan            2.03
ALAN            1.62
alan            1.38
ALAN            1.20
alan            1.03
alanine         1.02
Allan           1.02
Allan           0.96
Alain           0.96
Activations Density: 0.003%