INDEX
Explanations
expressions of personal concern and uncertainty
Follows single characters or short strings
swear words, expletives, and punctuation
Negative Logits
Paglinawan        -1.02
NameInMap         -0.92
<bos>             -0.91
`;                -0.89
'},               -0.88
>`;               -0.88
كومونز            -0.85
GenerationType    -0.85
*/                -0.84
[])               -0.83
Positive Logits
.                 1.07
fucking           0.89
,                 0.86
!                 0.83
…                 0.79
freakin           0.76
fuckin            0.73
FUCKING           0.69
stuff             0.67
….                0.66
Activations Density: 0.534%