Explanations
punctuation and formatting elements typically found in code
Negative Logits
vida
-0.15
fucking
-0.14
âĨij
-0.14
.integration
-0.14
//_
-0.13
.âĢ¢
-0.13
Fucking
-0.13
Barb
-0.13
("
-0.13
^↵
-0.13
Positive Logits
/**↵
0.53
/**↵
0.49
/**
0.33
/**č↵
0.30
/**
0.27
/**↵↵
0.23
/**č↵
0.23
/**<
0.22
**/↵
0.21
///
0.21
Activations Density 0.125%