INDEX
Explanations
instances where daring or challenging actions are described
occurrences of the word "dare" in various contexts
Negative Logits
urgy        -0.79
ensive      -0.75
effic       -0.75
ĻĤ          -0.70
division    -0.69
ounced      -0.64
Methods     -0.62
anesthesia  -0.62
edged       -0.62
Tool        -0.61
Positive Logits
Dare        1.35
dare        1.20
daring      0.77
dared       0.76
boldly      0.74
provoke     0.72
censor      0.70
cot         0.68
geon        0.67
defy        0.67
Activations Density 0.008%