INDEX
    Explanations

    mentions of political ideologies and leaders

    Negative Logits
    eret        -0.67
    hett        -0.56
    hiba        -0.56
    Downloadha  -0.55
    afety       -0.55
    ecause      -0.55
    jri         -0.54
    atche       -0.54
    ogs         -0.54
    accompan    -0.54
    Positive Logits
     Centauri     0.68
    autical       0.66
    omaly         0.65
    wered         0.62
    Ü             0.59
    axis          0.57
     Catalyst     0.57
    âĶĢâĶĢâĶĢâĶĢ  0.57
    enment        0.53
    ALLY          0.52
    Act Density 6.150%

    No Known Activations