INDEX
    Explanations

    expressions of surprise or disbelief

    Negative Logits
    Token             Logit
    " inquiries"      -0.52
    " assuming"       -0.51
    "ceres"           -0.50
    "piac"            -0.49
    " Pais"           -0.47
    " Normdatei"      -0.47
    "Smiles"          -0.47
    "Discografia"     -0.46
    "MarshalTo"       -0.46
    " hypothetical"   -0.45
    Positive Logits
    Token             Logit
    "rungsseite"      0.76
    " /\."            0.64
    "новниш"          0.60
    "!"               0.55
    "eleste"          0.55
    "😭😭"              0.54
    "Espèce"          0.54
    " réfugi"         0.52
    "?!"              0.52
    "Tikang"          0.52
    Activation Density: 0.157%

    No Known Activations