INDEX
Explanations
explaining or giving examples
requests and content related to programming/code (including code blocks and technical prompts), often spiking around conversation turn-boundary tokens.
Negative Logits
ওই    0.44
apparently    0.38
apparently    0.36
OffsetY    0.35
vendar    0.35
miatt    0.33
arginine    0.33
RefreshToken    0.33
회가    0.32
নাকি    0.32
Positive Logits
பொதுவாக    0.52
สำหรับ    0.51
সাধারণত    0.49
สำหรับ    0.47
Examples    0.46
Для    0.45
다양한    0.43
Typically    0.43
Descripción    0.43
Beispiele    0.42
Activations Density: 0.406%