INDEX
Explanations
instances of the word "glad" within the text
expressions of gratitude or contentment
Negative Logits
classified    -0.78
$$$$          -0.71
Saharan       -0.67
improve       -0.66
artifacts     -0.65
heat          -0.64
irrel         -0.64
effic         -0.64
UE            -0.63
IDs           -0.62
Positive Logits
glad            1.30
thankful        0.86
happy           0.84
ness            0.81
sorry           0.79
withstanding    0.79
itudinal        0.79
grateful        0.79
pardon          0.76
thanking        0.74
Activations Density: 0.006%