INDEX
    Explanations

    political opinions

    Negative Logits
    Token            Logit
    "enth"           -0.07
    " Simulation"    -0.07
    (not shown)      -0.06
    "к"              -0.06
    "zz"             -0.06
    "orary"          -0.06
    "cmp"            -0.06
    "とした"         -0.06
    (not shown)      -0.06
    "ora"            -0.06
    Positive Logits
    Token            Logit
    ".Navigate"      0.08
    "):?>↵"          0.06
    " reels"         0.06
    "uelve"          0.06
    ".embedding"     0.06
    " мар"           0.06
    "?<"             0.06
    "_BR"            0.06
    "	INNER"         0.06
    ")>="            0.06
    Activation Density: 0.057%

    No Known Activations