INDEX
    Explanations

    controversial topics or subjects

    references to controversial subjects

    Negative Logits
    abiding         -0.85
    vation          -0.83
    abetic          -0.77
    thia            -0.75
    nings           -0.75
    ILA             -0.74
    á               -0.73
    abet            -0.72
    ruary           -0.71
    united          -0.71
    Positive Logits
     topic           0.99
     topics          0.95
     aspects         0.93
     proposition     0.91
     aspect          0.90
     propositions    0.89
     decisions       0.88
     opinions        0.86
     remarks         0.84
     controversial   0.84
    Act Density 0.073%

    No Known Activations