Explanations
common sentence completions
formatting and structural cues in prompts and dialogues, such as section labels, list items, numbering, and emphasized elements
Negative Logits
основные  0.55
отдельные  0.54
長期  0.52
൭  0.49
संस्थ  0.49
repert  0.49
વિભાગ  0.48
மாவட்ட  0.47
collectivités  0.47
กลุ่ม  0.46
Positive Logits
chicken  0.71
chocolate  0.71
cheese  0.69
Cheese  0.68
cows  0.67
pizza  0.65
Pokemon  0.65
beer  0.64
joke  0.63
bacon  0.63
Activations Density 0.101%