Explanations
terms related to causing or provoking intense emotions or actions
terms related to incitement and provocation, particularly in political or social contexts
Negative Logits
ichick        -0.79
©¶æ           -0.72
framework     -0.71
ellipt        -0.68
Tunnel        -0.65
neau          -0.65
iddler        -0.65
Accounting    -0.64
ighth         -0.62
McCl          -0.62
Positive Logits
itement       0.98
inciting      0.95
xual          0.89
sidx          0.80
Demand        0.78
incite        0.78
Spread        0.77
ãĥĥ           0.76
ãĥŁ           0.76
cele          0.76
Activations Density 0.034%
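
The density figure can be read as the share of tokens on which the feature fires at all. The page does not say how it is computed, so the sketch below is only one plausible reading: it assumes density means the percentage of tokens whose activation exceeds a threshold of zero. The function name `activation_density`, the `threshold` parameter, and the sample data are all hypothetical and chosen for illustration.

```python
def activation_density(activations, threshold=0.0):
    """Return the percentage of tokens whose activation exceeds `threshold`.

    Assumption: "density" means the fraction of tokens with a nonzero
    (above-threshold) activation, expressed as a percentage.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return 100.0 * active / len(activations)


# Hypothetical data: the feature fires on 1 token out of 10,000.
sample = [0.0] * 9999 + [2.5]
density = activation_density(sample)
print(f"{density:.3f}%")
```

Under this reading, a density of 0.034% would correspond to the feature firing on roughly 34 tokens per 100,000.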