INDEX
    Explanations

    expressions of genuine admiration or authenticity

    Negative Logits
    Token        Logit
    "rous"       -0.15
    "sel"        -0.15
    " Briggs"    -0.15
    "ses"        -0.15
    ".au"        -0.15
    " mere"      -0.14
    "lot"        -0.14
    "strand"     -0.14
    "angan"      -0.14
    "ains"       -0.14
    Positive Logits
    Token        Logit
    "/false"     0.28
    "-blue"      0.21
    "-life"      0.21
    "ignment"    0.19
    "fully"      0.17
    " truly"     0.15
    "born"       0.15
    "474"        0.15
    "-таки"      0.15
    "ajan"       0.14
    Act Density 0.022%

    No Known Activations