INDEX
Explanations
references to tokens and token-related operations
Negative Logits
Token         Logit
Shakspeare    -0.94
Hec           -0.88
Monfieur      -0.88
Pyrr          -0.88
GLS           -0.85
screenWidth   -0.84
Diſ           -0.83
Jefus         -0.82
HAP           -0.82
Purdy         -0.82
Positive Logits
Token     Logit
token     2.12
tokens    2.07
Token     1.98
token     1.88
Token     1.83
TOKEN     1.76
tokens    1.72
Tokens    1.72
Tokens    1.67
TOKEN     1.63
Activations Density: 0.044%
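
To make the density figure concrete, here is a minimal sketch of how such a number could be computed. It assumes that density means the fraction of token positions on which the feature's activation is above zero; the page does not state its exact definition, so treat that as an assumption. The function name `activation_density` and the sample data are hypothetical and chosen only to reproduce a value of the same size.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds `threshold`.

    Assumption: "density" is the share of positions where the feature fires at all.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical data: 100,000 token positions, 44 of which have a nonzero activation.
sample = [0.0] * 99_956 + [1.0] * 44
density = activation_density(sample)
print(f"{density:.3%}")  # prints "0.044%"
```

Under this reading, a density of 0.044% means the feature fires on roughly 4 or 5 token positions out of every 10,000.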