INDEX
    Explanations

    references to religious themes and figures

    Negative Logits
    Token         Logit
    uhn           -0.15
    reten         -0.14
    DEV           -0.14
    eturn         -0.14
    ullo          -0.14
    529           -0.14
    429           -0.14
    orgot         -0.13
    argent        -0.13
     Vader        -0.13
    Positive Logits
    Token         Logit
     proph        0.43
     prophet      0.38
     prophecy     0.35
     prop         0.32
     prophets     0.31
     Prophet      0.30
     prediction   0.29
     predictions  0.29
     predicting   0.28
     predictive   0.28
    Activation Density: 0.221%

    No Known Activations