INDEX
    Explanations

    phrases related to causality or influence

    phrases that indicate cause and effect relationships

    Negative Logits
    "arily"            -0.78
    " todd"            -0.72
    " toured"          -0.71
    "ta"               -0.70
    "tes"              -0.69
    "alian"            -0.68
    "tel"              -0.66
    "right"            -0.66
    "ared"             -0.66
    " cared"           -0.66

    Positive Logits
    " confusion"        0.87
    " confirmation"     0.78
    " speculation"      0.77
    " bloodshed"        0.77
    " dismissal"        0.77
    " extinction"       0.76
    " widespread"       0.76
    " stagnation"       0.76
    " outbreaks"        0.75
    " deterioration"    0.73

    Activation Density: 0.067%

    No Known Activations