INDEX
    Explanations

    words related to catastrophic and disastrous events

    Negative Logits
    pton     -0.80
    yip      -0.77
    chy      -0.73
    pei      -0.72
    plet     -0.71
    perty    -0.70
    kson     -0.70
    ptive    -0.67
    pared    -0.67
    pheus    -0.67
    Positive Logits
    rophe     1.28
    rophic    1.07
    roph      0.99
    efully    0.94
    ruct      0.93
    ream      0.93
    eful      0.91
    rup       0.89
    odon      0.88
    rike      0.87
    Activation Density: 0.028%

    No Known Activations