INDEX
Explanations
entities related to controversy or conflict, such as political figures or actions
references to political or social issues
Negative Logits
Quantity    -0.76
ãĥĺ         -0.69
nings       -0.65
LOCK        -0.65
ãĤ±         -0.64
itive       -0.63
ellipt      -0.61
ainment     -0.60
Collider    -0.59
Valkyrie    -0.59
Positive Logits
alike       0.99
whom        0.98
who         0.93
sembly      0.82
vae         0.77
©¶æ¥µ       0.75
conserv     0.71
oots        0.71
agna        0.70
opes        0.69
Activations Density: 0.824%