    Explanations

    mentions of natural disasters, specifically wildfires

    references to wildfires or fire incidents

    Negative Logits
    Token             Logit
    "Birth"           -0.75
    "Tok"             -0.72
    " Barcl"          -0.66
    " Surgery"        -0.65
    "WOR"             -0.63
    "afort"           -0.62
    "çİĭ"             -0.61
    "GB"              -0.61
    "conservative"    -0.61
    "Students"        -0.61

    Positive Logits
    Token             Logit
    "hooting"         0.94
    " fires"          0.92
    "fires"           0.90
    " torches"        0.88
    "paces"           0.86
    "linger"          0.85
    " Fired"          0.84
    " flares"         0.83
    " blazing"        0.82
    " retard"         0.81

    Activation Density: 0.007%

    No Known Activations