INDEX
Explanations
dates in the format "Month Day" with high activation values
dates, specifically occurrences in March
Negative Logits
Token       Logit
cumbers     -0.76
kB          -0.69
folios      -0.68
unnecess    -0.67
lder        -0.67
cleansing   -0.65
gone        -0.64
udging      -0.64
itaire      -0.63
tox         -0.62
Positive Logits
Token       Logit
Madness     1.10
riage       0.90
nard        0.88
2019        0.85
ing         0.85
flower      0.85
yard        0.82
2015        0.81
Arbor       0.80
steen       0.79
Activations Density: 0.021%