INDEX
    Explanations

    expressions focusing on personal experiences and narratives

    Negative Logits
    lective       -0.53
    urable        -0.52
    ebi           -0.51
     Maury        -0.50
     dié          -0.50
    ("")]         -0.48
     pelican      -0.48
     noma         -0.48
     approximate  -0.47
     wearer       -0.47
    Positive Logits
    +#+           0.77
    MockBean      0.73
     weird        0.70
    Enllaces      0.69
    Anyways       0.67
     disant       0.66
    guys          0.65
    weird         0.64
     Biôgrafia    0.64
     guy          0.64
    Act Density 0.091%

    No Known Activations