INDEX
    Explanations

    It seems to detect specific phrases or keywords, but the given activations alone do not make clear what the neuron is looking for.

    mentions of economic concepts or factors

    Negative Logits
    Token            Logit
    " subpoena"      -0.86
    " redacted"      -0.79
    " Hayden"        -0.79
    " warrant"       -0.77
    " Adams"         -0.76
    " Rutherford"    -0.76
    " subpoen"       -0.75
    " Belichick"     -0.74
    " Clapper"       -0.73
    " Cheney"        -0.73
    Positive Logits
    Token            Logit
    "Vill"           1.31
    "Residents"      1.26
    " Rohing"        1.16
    " villagers"     1.16
    "Tour"           1.12
    "Girls"          1.10
    "Farm"           1.07
    "Women"          1.07
    "obyl"           1.06
    "India"          1.05
    Activation Density: 0.589%

    No Known Activations