    Explanations

    mentions of the social media platform Facebook

    Negative Logits
    <bos>        -1.38
     auffi       -0.92
     comigo      -0.90
     Efq         -0.83
     romántico   -0.81
     ainfi       -0.81
     polaire     -0.80
     decorativa  -0.77
     engraçadas  -0.77
     mukaan      -0.76
    Positive Logits
     propOrder   0.82
     digress     0.59
     the         0.59
    Revenir      0.58
     `           0.57
     super       0.56
     its         0.55
    LookAnd      0.55
    ERIES        0.55
    OrNil        0.54
    Activation Density: 0.908%

    No Known Activations