INDEX
    Explanations

    phrases that exhibit admiration or positive sentiment towards topics, using a specific grammatical structure involving articles and exclamatory expressions

    Negative Logits
    Token       Logit
    "atura"     -0.17
    " Shed"     -0.15
    "oku"       -0.15
    "undy"      -0.15
    " Graz"     -0.14
    "pute"      -0.14
    "ista"      -0.14
    "цу"        -0.14
    "273"       -0.14
    "ắt"        -0.14

    Positive Logits
    Token             Logit
    " difference"     0.17
    "ehr"             0.17
    " pity"           0.17
    "eck"             0.16
    " shame"          0.16
    "contrast"        0.16
    " contrast"       0.16
    " coincidence"    0.15
    " waste"          0.15
    "sis"             0.15
    Act Density 0.010%
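
To give a sense of scale for the density figure, here is a minimal sketch. It assumes that activation density means the fraction of sampled tokens on which the feature's activation is above zero; the page does not define it. The function name `activation_density` and the sample data are hypothetical, for illustration only.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of tokens whose activation exceeds `threshold`.

    Assumption: density = (active tokens) / (all sampled tokens).
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: the feature fires on 1 token out of 10,000.
sample = [0.0] * 9999 + [3.2]
density = activation_density(sample)
print(f"{density:.3%}")
```

Under this assumption, a density of 0.010% corresponds to roughly one active token per 10,000 sampled tokens.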

    No Known Activations