INDEX
    Explanations

    common sentence completions

    formatting and structural cues in prompts and dialogues, such as section labels, list items, numbering, and emphasized elements

    Negative Logits
     основные    0.55
     отдельные    0.54
    長期    0.52
    0.49
     संस्थ    0.49
     repert    0.49
     વિભાગ    0.48
     மாவட்ட    0.47
     collectivités    0.47
    กลุ่ม    0.46
    Positive Logits
     chicken    0.71
     chocolate    0.71
     cheese    0.69
     Cheese    0.68
     cows    0.67
     pizza    0.65
     Pokemon    0.65
     beer    0.64
     joke    0.63
     bacon    0.63
    Activation Density: 0.101%

    No Known Activations