INDEX
    Explanations

    expressions related to personal opinions or viewpoints

    Negative Logits
    Token         Logit
    orian         -0.17
    eri           -0.16
    uras          -0.16
    eree          -0.15
    leta          -0.15
    lsi           -0.15
    issen         -0.14
    lier          -0.14
    ifestyles     -0.14
    ey            -0.13

    Positive Logits
    Token         Logit
    ated           0.32
    ATED           0.24
    POSITE         0.22
    aires          0.20
    inions         0.20
    naire          0.19
    ative          0.19
     formation     0.18
    ating          0.18
    naires         0.18

    Act Density 0.012%

    No Known Activations