INDEX
Explanations
elements like code snippets or commands related to programming
Code, file paths, and programming-related text
formatting and document structure
Negative Logits
itſelf          -0.74
ArrowToggle     -0.65
Specifiche      -0.65
LEGGI           -0.64
Vikipedi        -0.63
請繼續往下閱讀    -0.63
ांकि            -0.62
dollis          -0.62
ſelves          -0.61
Gambas          -0.60
Positive Logits
<eos>           1.13
</b>            0.65
<h1>            0.64
<h2>            0.64
');             0.64
");             0.62
});             0.61
↵↵              0.60
↵↵↵↵            0.59
↵↵↵↵↵           0.59
Activations Density 0.974%
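
The density figure above can be read as the share of sampled tokens on which the feature fires. The page does not define the metric, so the sketch below rests on an assumption: that density is the fraction of tokens with an activation above zero. The function name `activation_density`, the `threshold` parameter, and the sample data are all hypothetical and are not taken from the dashboard.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of tokens whose activation exceeds `threshold`.

    Assumption: "density" means the share of sampled tokens on which the
    feature is active (activation > threshold). The source page does not
    state its exact definition.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: the feature fires on 1 token out of 100.
sample = [0.0] * 99 + [2.5]
print(f"{activation_density(sample):.3%}")
```

Under this reading, a density of 0.974% would mean the feature is active on roughly one sampled token in a hundred.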