    Explanations

    expressions of love and positive feelings towards experiences or objects

    Negative Logits
    unas      -0.16
    ump       -0.15
    sect      -0.15
    372       -0.15
    бор       -0.15
    QUE       -0.15
    um        -0.15
    .xz       -0.15
    uman      -0.15
    real      -0.14

    Positive Logits
     plung    0.15
    full      0.15
    plain     0.14
    fat       0.14
    birds     0.14
    .cbo      0.14
    uí        0.14
     Volk     0.14
    ably      0.14
    -filled   0.14
    Act Density 0.028%

    No Known Activations