INDEX
    Explanations

    long sequences of capital letters

    repeated mentions of the word "Long."

    Negative Logits
    " babys"          -0.68
    " seating"        -0.67
    " acting"         -0.67
    " cabinet"        -0.66
    " applicable"     -0.65
    " trust"          -0.64
    " recept"         -0.63
    " availability"   -0.62
    " attending"      -0.62
    " Arch"           -0.62
    Positive Logits
    "Long"            3.59
    "long"            2.07
    "Short"           1.96
    " LONG"           1.69
    " Long"           1.65
    "Little"          1.41
    "short"           1.33
    "Large"           1.30
    "Old"             1.29
    "Length"          1.28
    Act Density 0.016%

    No Known Activations