INDEX
    Explanations

    mentions of time, specifically in Spanish, such as hours or time-related expressions

    references to the word "las" and its variations

    Negative Logits
    "otions"      -0.79
    "ebus"        -0.75
    " Nanto"      -0.74
    "eur"         -0.73
    "acle"        -0.71
    "otional"     -0.69
    "Reviewer"    -0.68
    "pter"        -0.67
    "ahs"         -0.66
    "acles"       -0.66
    Positive Logits
    "aurus"       0.95
    "agne"        0.84
    "peed"        0.80
    "agna"        0.79
    " Takeru"     0.79
    "TER"         0.77
    "quez"        0.76
    " redes"      0.74
    "ross"        0.74
    "arbon"       0.73
    Activation Density: 0.013%

    No Known Activations