    Explanations

    references to significant or notable individuals and their contributions

    Negative Logits
    Token      Logit
    " ("       -0.33
    "...("     -0.30
    "...."     -0.25
    " (..."    -0.24
    "--"       -0.24
    " (&"      -0.24
    "---"      -0.24
    "..."      -0.24
    " (~"      -0.23
    " -"       -0.22
    Positive Logits
    Token             Logit
    " celebrity"      0.42
    " Celebrity"      0.40
    "Celebr"          0.27
    " celebrities"    0.25
    " cele"           0.23
    (token missing)   0.23
    "cele"            0.23
    " Cele"           0.22
    "”.↵"             0.20
    " —↵"             0.20
    Activation Density: 0.016%

    No Known Activations