    Explanations

    mentions of specific names or terms related to automobiles and chemicals

    references to specific people or entities, particularly those associated with politics or cultural issues

    Negative Logits
    Token           Logit
    "edin"          -0.96
    "ablishment"    -0.89
    "anamo"         -0.87
    "rat"           -0.86
    "untu"          -0.86
    "yo"            -0.82
    "raved"         -0.81
    "alg"           -0.80
    "ilon"          -0.79
    "amen"          -0.79
    Positive Logits
    Token           Logit
    "flame"         0.76
    "::::::::"      0.75
    " flare"        0.75
    "══"            0.73
    " Pryor"        0.73
    "Lens"          0.71
    " Ago"          0.69
    " [|"           0.68
    "▬"             0.68
    ">>>>>>>>"      0.66
    Activation Density: 0.046%

    No Known Activations