INDEX
Explanations
words related to challenging or defying authority or societal norms
instances of the word "dare" in various contexts
Negative Logits
urgy           -0.79
Rite           -0.74
effic          -0.70
ulator         -0.69
iple           -0.68
ulatory        -0.67
ulators        -0.67
winner         -0.64
rator          -0.64
VERTISEMENT    -0.63
Positive Logits
dare           1.08
ngth           0.96
Dare           0.93
defy           0.88
daring         0.87
dared          0.84
provoke        0.78
boldly         0.77
roam           0.75
aspire         0.73
Activations Density 0.019%
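
As a rough illustration of what a density figure like this measures, here is a minimal sketch. It assumes that density means the share of tokens on which the feature has a non-zero activation; the source does not define the term. The function name `activation_density`, the `threshold` parameter, and the sample data are all hypothetical.

```python
def activation_density(activations, threshold=0.0):
    """Return the percentage of tokens whose activation exceeds `threshold`.

    Assumption: "density" is the fraction of tokens on which the feature
    is active. The source does not state the definition it uses.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return 100.0 * active / len(activations)


# Hypothetical data: the feature is active on 2 of 10,000 tokens.
sample = [0.0] * 9998 + [1.3, 0.4]
print(f"{activation_density(sample):.3f}%")  # prints 0.020%
```

Under this reading, a density of 0.019% would correspond to the feature being active on roughly 19 of every 100,000 tokens.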