Explanations
keywords related to time, specifically referring to a particular week
repeated mentions of the word "week" to indicate time references in news
Negative Logits
itialized    -0.68
Gems         -0.68
bler         -0.67
izoph        -0.64
Compan       -0.62
ventus       -0.62
comr         -0.60
vec          -0.59
ancies       -0.59
amate        -0.59
Positive Logits
days         1.02
night        1.01
afternoon    0.94
morning      0.81
evening      0.81
mornings     0.78
flower       0.75
night        0.72
vernight     0.72
morning      0.71
Activations Density 0.045%