    Explanations

    instances of actions or events taking place

    instances of the verb "come" and related phrases indicating occurrences or events

    Negative Logits
    Token          Logit
    " violate"     -0.70
    " violates"    -0.67
    "yth"          -0.65
    " rejects"     -0.60
    "soType"       -0.57
    "ink"          -0.57
    " contam"      -0.56
    "obal"         -0.55
    "wd"           -0.55
    " happiest"    -0.55
    Positive Logits
    Token          Logit
    "20439"        0.70
    "raphics"      0.62
    "taining"      0.61
    "hra"          0.61
    "arde"         0.61
    "when"         0.60
    "ventus"       0.60
    "éĹĺ"          0.58
    " RTX"         0.56
    "uitous"       0.55
    Activation Density: 0.210%

    No Known Activations