INDEX
    Explanations

    expressions of admiration or astonishment

    Negative Logits
    "sillon"      -0.77
    "lüğ"         -0.73
    " emplois"    -0.73
    "Carole"      -0.70
    "yyah"        -0.70
    " Carole"     -0.70
    "cientos"     -0.70
    " nostru"     -0.67
    "uyla"        -0.65
    " Barrow"     -0.65

    Positive Logits
    " DMG"         0.85
    " Amaz"        0.83
    " Ains"        0.83
    " Vain"        0.81
    " Erm"         0.81
    "jectures"     0.79
    " Flames"      0.78
    "AMAZING"      0.78
    " Lain"        0.77
    " Chains"      0.77
    Activation Density: 0.090%

    No Known Activations