    Explanations

    expressions of approval or positivity

    Negative Logits
        " disambiguazione"    -0.71
        " Савезне"            -0.68
        " informée"           -0.64
        " étoit"              -0.62
        " estekak"            -0.60
        " avoient"            -0.60
        "afficheront"         -0.60
        " propOrder"          -0.60
        " ujednoznacz"        -0.59
        "parsedMessage"       -0.58

    Positive Logits
        " Niche"               0.72
        " niche"               0.64
        " counters"            0.58
        " dem"                 0.57
        " confidential"        0.55
        "ache"                 0.54
        " catch"               0.52
        " garage"              0.52
        " green"               0.51
        " pos"                 0.50
    Activation Density: 0.686%
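
    For readers unfamiliar with the density figure, the sketch below shows one common way such a number can be computed: the fraction of token positions on which a feature's activation exceeds a threshold. This is an assumption about the metric, not a statement of how this dashboard derives it; the function name `activation_density`, the `threshold` parameter, and the sample data are all hypothetical.

```python
from typing import Sequence


def activation_density(activations: Sequence[float], threshold: float = 0.0) -> float:
    """Return the fraction of token positions whose activation exceeds `threshold`.

    Assumption: "density" means (number of active positions) / (total positions).
    """
    if not activations:
        raise ValueError("need at least one activation value")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical data: 1000 token positions, 7 of which fire.
sample = [0.0] * 993 + [1.3, 0.4, 2.1, 0.9, 0.2, 3.0, 0.7]
density = activation_density(sample)
print(f"{density:.3%}")  # prints 0.700%
```

    Under this definition, a density of 0.686% would mean the feature fires on roughly 7 of every 1000 tokens sampled.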

    No Known Activations