INDEX
Explanations
references to secrets and hidden information
NEGATIVE LOGITS
ünkü      -0.15
eo        -0.15
OPTIONS   -0.13
وند       -0.13
562       -0.13
ayo       -0.13
Pragma    -0.13
acia      -0.13
же        -0.13
359       -0.12
POSITIVE LOGITS
secret    0.77
secrets   0.73
Secret    0.60
secret    0.60
Secrets   0.58
秘        0.57
-secret   0.56
Secret    0.56
SECRET    0.55
secre     0.53
Activations Density 0.196%