INDEX
    Explanations

    names or proper nouns

    suffixes or fragments of words

    Negative Logits
    " Stella"        -0.63
    " Paula"         -0.60
    "Nar"            -0.59
    " Vengeance"     -0.58
    " faire"         -0.57
    " BUR"           -0.54
    " Psycho"        -0.54
    " Lily"          -0.54
    "CAP"            -0.54
    " Pix"           -0.53

    Positive Logits
    "enegger"         1.12
    " himself"        1.04
    " testified"      0.98
    " oversaw"        0.90
    " wrote"          0.88
    "'s"              0.87
    " penned"         0.87
    "baum"            0.85
    " told"           0.85
    " specializes"    0.84

    Activation Density: 0.166%

    No Known Activations