INDEX
    Explanations

    phrases related to specific concepts or objects, potentially with negative connotations

    metaphorical expressions that imply deception, manipulation, or undesirable outcomes

    NEGATIVE LOGITS
    URA             -0.79
    cont            -0.78
    rongh           -0.77
    alle            -0.76
    ãĤ¼ãĤ¦ãĤ¹       -0.73
    ickets          -0.72
    qus             -0.71
    Rail            -0.71
    rab             -0.71
    arel            -0.70
    POSITIVE LOGITS
     mentality       1.04
     scenario        0.99
     approach        0.98
     tactic          0.92
     moment          0.86
     situation       0.85
     solution        0.85
     fallacy         0.84
     maneuver        0.82
     tactics         0.82
    Activation Density: 0.378%

    No Known Activations