INDEX
Explanations
dates in various formats
segments of text related to updates or revisions, particularly those formatted with timestamps
Negative Logits
Token      Logit
mustard    -0.65
grop       -0.64
Butcher    -0.63
ļéĨĴ       -0.62
creep      -0.61
voic       -0.60
chants     -0.57
Gavin      -0.57
Mellon     -0.56
Chop       -0.56
Positive Logits
Token      Logit
Aug        1.36
Feb        1.35
31         1.30
22         1.25
30         1.24
28         1.24
26         1.23
29         1.23
27         1.22
Apr        1.21
Activations Density: 0.018%