INDEX
    Explanations

    expressions of personal feelings and introspection

    Negative Logits
    Token        Logit
    " maneu"     -0.77
    " strick"    -0.71
    " attemp"    -0.70
    " lgbt"      -0.68
    " horrend"   -0.64
    " shenan"    -0.64
    " toledo"    -0.64
    " encomp"    -0.63
    " increa"    -0.62
    " resear"    -0.61

    Positive Logits
    Token        Logit
    " feel"      0.56
    "ългария"    0.55
    " felt"      0.54
    " feels"     0.52
    "feel"       0.50
    " feeling"   0.49
    " Feel"      0.47
    " sento"     0.47
    " sinto"     0.47
    " πως"       0.46
    Activation Density: 0.125%

    No Known Activations