    Explanations

    mentions of preferences or desires in the text

    Negative Logits
    Token                     Logit
    " createSlice"            -1.02
    "IntoConstraints"         -0.99
    " AssemblyCulture"        -0.95
    "存于互联网档案馆"        -0.94
    " سكانية"                 -0.92
    "complexContent"          -0.91
    "NameInMap"               -0.90
    "writeFieldEnd"           -0.90
    "WithIOException"         -0.88
    "Erreferentziak"          -0.86

    Positive Logits
    Token                     Logit
    " likes"                  1.25
    " needs"                  0.77
    " wants"                  0.74
    " loves"                  0.69
    "likes"                   0.66
    " NEEDS"                  0.63
    " Needs"                  0.60
    " Likes"                  0.60
    " knows"                  0.57
    " Wants"                  0.56

    Activation Density: 0.210%

    No Known Activations