    Negative Logits
    Token           Logit
    "375"           -0.07
    " kuchy"        -0.07
    " Evans"        -0.07
    "<Color"        -0.06
    "'av"           -0.06
    "<char"         -0.06
    " domina"       -0.06
    " italiane"     -0.06
    " elephants"    -0.06
    " Howard"       -0.06
    Positive Logits
    Token           Logit
    "(list"         0.15
    "_LIST"         0.14
    ".list"         0.12
    "/list"         0.11
    "-list"         0.10
    "\tlist"        0.10
    " list"         0.09
    "list"          0.08
    " LIST"         0.08
    "LIST"          0.08
    Activation Density: 0.035%

    No Known Activations