    Explanations

    terms associated with influence and definition

    Negative Logits
    iphate      -0.71
    DERR        -0.70
    idel        -0.69
    ighters     -0.69
    igers       -0.67
    ãĥı         -0.67
    intend      -0.67
    aeus        -0.65
    sels        -0.64
    leased      -0.64

    Positive Logits
     contemporary    0.85
     everything      0.80
     our             0.78
     discussions     0.78
     debates         0.77
     both            0.75
     modern          0.74
     anthropology    0.72
     colonialism     0.72
     many            0.71

    Act Density 0.071%

    No Known Activations