INDEX
Explanations
dates written in a specific format (e.g., Month/Day/Year)
punctuation marks and certain phrases in text formats, likely within dates or file references
Negative Logits
ctrl     -0.77
ELF      -0.75
glim     -0.74
BIL      -0.70
byss     -0.69
withd    -0.67
requ     -0.65
BAT      -0.64
TW       -0.64
elf      -0.64
Positive Logits
2010     0.99
2012     0.97
2011     0.96
2014     0.96
2008     0.95
2013     0.94
2009     0.94
2006     0.93
1998     0.91
2007     0.88
Activations Density 0.098%
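
The density figure above is usually read as the share of token positions on which the feature fires. The page does not state how it was computed, so the following is only a minimal sketch under that assumption: the function name `activation_density`, the zero threshold, and the sample data are all hypothetical and chosen to reproduce a value of 0.098%.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds threshold.

    Assumption: "density" means the share of tokens on which the feature
    is active. The source page does not define the term.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: the feature fires on 98 of 100,000 tokens.
sample = [1.0] * 98 + [0.0] * (100_000 - 98)
density = activation_density(sample)
print(f"{density:.3%}")
```

Under this reading, a density of 0.098% means the feature is active on roughly one token in a thousand, which fits a feature that responds mainly to year tokens.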