INDEX
    Explanations

    words related to causal relationships or connections

    the word "that" in various contexts

    Negative Logits
    Token        Logit
    "ron"        -0.68
    "river"      -0.67
    "english"    -0.61
    "yne"        -0.60
    "lander"     -0.59
    "rss"        -0.59
    "ctica"      -0.59
    "Desk"       -0.59
    "ner"        -0.59
    "Kit"        -0.59
    Positive Logits
    Token             Logit
    " accompanies"    0.79
    " fateful"        0.78
    " preceded"       0.78
    " caused"         0.76
    " consumes"       0.75
    " arose"          0.73
    " spawned"        0.73
    "ItemTracker"     0.72
    " THEY"           0.71
    " consumed"       0.70
    Activation Density: 0.216%

    No Known Activations