    Explanations

    phrases indicating a specific event or action occurring at a particular time or location

    instances of the word "when" indicating temporal context in situations

    Negative Logits
    Token       Logit
    hack        -0.68
    ha          -0.65
    Grade       -0.65
    hi          -0.64
    less        -0.62
    ictive      -0.62
     nig        -0.62
    Fine        -0.61
    cut         -0.60
    agin        -0.58
    Positive Logits
    Token           Logit
    soever          0.90
    */(             0.90
     confronted     0.73
     encountering   0.72
     they           0.72
    irlf            0.72
    ":[{"           0.70
     asked          0.70
     undergoing     0.68
     someone        0.66
    Activation Density: 0.083%

    No Known Activations