    Explanations

    mentions of the word "Princess" at varying intensities, possibly related to different contexts or relationships

    references to various princesses

    Negative Logits
    Token        Logit
    " spaced"    -0.69
    "sych"       -0.68
    "ophon"      -0.65
    "ulhu"       -0.63
    " neur"      -0.62
    "ucl"        -0.62
    "rils"       -0.61
    " funn"      -0.61
    "appa"       -0.61
    "oller"      -0.60
    Positive Logits
    Token          Logit
    " Leia"        1.12
    " Bride"       1.10
    " Celest"      1.01
    " Diana"       0.97
    "anova"        0.97
    " Princess"    0.95
    " Peach"       0.92
    " princess"    0.88
    " Fiona"       0.84
    "cess"         0.83
    Activation Density: 0.029%

    No Known Activations