INDEX
    Explanations

    references to survival and danger in narratives

    Negative Logits
    itudes       -0.19
    cciones      -0.18
    ções         -0.18
    rella        -0.17
    udes         -0.17
    thers        -0.17
    uela         -0.17
    nds          -0.16
    uries        -0.16
    enza         -0.16
    Positive Logits
    cimiento      0.23
    amento        0.23
    imiento       0.19
    acimiento     0.19
    issement      0.19
    isme          0.18
    amiento       0.18
    onnement      0.17
    ogram         0.17
    Ī             0.17
    Activation Density: 0.095%

    No Known Activations