    Explanations

    expressions of comparison or specificity, particularly involving the words "this", "that", and related phrases

    "this," "that," or "that kind of"

    Negative Logits
    Token             Logit
    "værende"         -0.48
    "жется"           -0.48
    " Du"             -0.47
    "发表于"          -0.45
    "AddWithValue"    -0.44
    " иначе"          -0.42
    "kust"            -0.41
    "izzano"          -0.40
    "wes"             -0.40
    " Nes"            -0.39

    Positive Logits
    Token             Logit
    " kind"            1.20
    " kinds"           0.98
    " type"            0.96
    " sort"            0.94
    " level"           0.90
    "kind"             0.89
    "kinds"            0.89
    " amount"          0.86
    " sorts"           0.85
    " soort"           0.83

    Act Density 0.297%

    No Known Activations