Explanations
dates in a specific format
occurrences of the word "Updated", likely indicating updates or revisions in content
Negative Logits
stood        -0.78
bors         -0.76
phal         -0.74
ften         -0.74
filled       -0.72
acht         -0.72
pity         -0.71
aden         -0.70
ciplinary    -0.70
vo           -0.68
Positive Logits
Updated      0.91
CLASSIFIED   0.88
Posted       0.80
Updates      0.77
Update       0.76
Published    0.72
UPDATE       0.71
Attempts     0.70
Update       0.70
ieval        0.70
Activations Density 0.007%