Explanations
token patterns and mathematical notation related to data representation
Negative Logits
Token        Logit
[not shown]  -0.91
[not shown]  -0.66
<eos>        -0.65
...          -0.61
.            -0.60
today        -0.57
/\.          -0.56
,            -0.54
de           -0.52
…            -0.51
Positive Logits
Token           Logit
betweenstory    1.16
__':            1.13
RegressionTest  1.02
myſelf          1.01
raiſ            1.01
purpoſe         0.98
iſt             0.98
]<<"            0.97
مرئيه           0.95
Portail         0.94
Activations Density: 3.020%