Explanations
references to weekends or weekend-related activities
Negative Logits
Token         Logit
ifice         -0.93
rity          -0.87
ACTED         -0.82
ymes          -0.81
umbing        -0.80
atche         -0.79
metics        -0.79
ronics        -0.78
ocate         -0.78
efficients    -0.77
Positive Logits
Token         Logit
mornings      1.22
nights        1.12
evenings      1.03
afternoon     1.03
brunch        0.99
weekend       0.97
evening       0.92
night         0.92
morning       0.92
night         0.90
Activations Density: 0.011%