    Explanations

    occurrences of the name "Alan" in various contexts

    Negative Logits
    Token              Logit
    " weight"          -0.45
    " these"           -0.44
    " they"            -0.42
    " These"           -0.38
    " kautta"          -0.35
    "These"            -0.33
    " sign"            -0.33
    " we"              -0.32
    " respectively"    -0.32
    "дир"              -0.32

    Positive Logits
    Token              Logit
    " Alan"             2.05
    "Alan"              2.03
    " ALAN"             1.62
    " alan"             1.38
    "ALAN"              1.20
    "alan"              1.03
    "alanine"           1.02
    " Allan"            1.02
    "Allan"             0.96
    " Alain"            0.96

    Activation Density: 0.003%

    No Known Activations