Explanations
dates in the format of month/day/year with high activation values
dates and timestamps
Negative Logits
advertisement    -0.76
owler            -0.71
unpredictable    -0.65
acebook          -0.63
avorite          -0.63
unpredict        -0.63
bledon           -0.62
uters            -0.62
matically        -0.62
izoph            -0.61
Positive Logits
RELEASE          1.02
GMT              0.93
Became           0.87
partName         0.86
Updated          0.82
Apply            0.77
08               0.76
07               0.74
EDIT             0.72
UTC              0.72
Activations Density 0.072%