Explanations
Instances of the word "rage" and related variations, indicating emotional intensity or outbursts.
Negative Logits
herty       -0.83
ramer       -0.78
arya        -0.74
metics      -0.72
rity        -0.69
icut        -0.67
leanor      -0.67
ournal      -0.67
icrobial    -0.67
roma        -0.67
Positive Logits
quit         1.13
raging       0.95
rage         0.93
fury         0.92
stru         0.81
raged        0.80
against      0.79
storms       0.79
ously        0.78
vengeance    0.77
Activations Density: 0.008%
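
The density figure is commonly read as the fraction of token positions on which the feature fires. The sketch below illustrates that reading only. It assumes density is computed as the share of activations above a threshold; the source does not state the exact formula. The function name `activation_density`, the threshold of zero, and the sample data are all hypothetical.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of positions whose activation exceeds threshold.

    Assumption: "density" means the share of token positions where the
    feature is active. This is not confirmed by the source page.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical data: 100,000 token positions, 8 of which activate.
sample = [0.0] * 100_000
for i in range(8):
    sample[i * 1000] = 1.5

density = activation_density(sample)
print(f"{density:.3%}")
```

Under this reading, a density of 0.008% would correspond to the feature firing on roughly 8 out of every 100,000 tokens.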