INDEX
Explanations
texts following a pattern of special characters appearing in pairs in the middle of the text
topics related to various applications and technologies
Negative Logits
hement  -0.86
sing    -0.69
leigh   -0.66
tered   -0.66
dn      -0.66
lected  -0.65
boro    -0.65
acan    -0.62
HOU     -0.61
hest    -0.60
Positive Logits
ccording          1.04
³³³³³³³³³³³³³³³³  0.83
³³³³³³³³          0.79
³³³               0.69
Similar           0.67
Prosecutors       0.61
Despite           0.61
Originally        0.60
³³³³              0.60
Throughout        0.60
Activations Density: 0.163%