INDEX
    Explanations

    names related to the topic at hand

    references to personal identity or self-referential phrases

    Negative Logits
    rama        -0.79
    hips        -0.78
    olor        -0.74
    rieved      -0.74
    iosity      -0.74
    roads       -0.71
    iary        -0.70
    aughters    -0.70
    inus        -0.69
    ulation     -0.68
    Positive Logits
    asure        1.15
    anwhile      1.11
    lda          1.04
    zzo          0.99
    leon         0.98
    ister        0.91
    asuring      0.86
    eting        0.85
    asured       0.84
    gging        0.83
    Activation Density: 0.017%

    No Known Activations