INDEX
    Explanations

    phrases related to imminent threats or dangers

    references to threats and impending dangers

    NEGATIVE LOGITS
    Token          Logit
    "ãĤ¼"          -0.65
    "ggies"        -0.65
    "xx"           -0.63
    "arus"         -0.63
    "essen"        -0.61
    "inations"     -0.60
    "ilar"         -0.59
    "earch"        -0.58
    "utsche"       -0.57
    "OH"           -0.57
    POSITIVE LOGITS
    Token          Logit
    " omin"        1.08
    " overhead"    1.06
    " hovering"    0.98
    " looming"     0.93
    " menacing"    0.87
    " above"       0.82
    " haunting"    0.82
    "rily"         0.81
    " challeng"    0.78
    " atop"        0.78
    Activation Density: 0.082%
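
    The density figure can be read as the share of token positions on which the feature fires. The sketch below illustrates that reading only; it assumes "active" means an activation strictly above a threshold of 0, and the function name `activation_density` and the sample data are hypothetical, not taken from the dashboard.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds threshold.

    Assumption: a feature counts as active when its activation is strictly
    greater than `threshold` (0.0 by default). The dashboard's exact
    definition is not stated in the source.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Synthetic example: 82 active positions out of 100,000 tokens.
sample = [0.0] * 99_918 + [1.0] * 82
density = activation_density(sample)
print(f"{density:.3%}")
```

    A density this low would mean the feature is silent on the vast majority of tokens, which fits a feature tied to a narrow theme.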

    No Known Activations