INDEX
    Explanations

    religious terms and pronouns

    Negative Logits
    " indicating"  0.37
    " Parser"  0.37
    " instruction"  0.36
    " instructing"  0.36
    " instructions"  0.34
    "ych"  0.34
    " prose"  0.34
    " instructs"  0.33
    "太郎"  0.33
    " C"  0.32
    Positive Logits
    "Nor"  0.39
    " نعمت"  0.36
    " Uniwers"  0.36
    "ပဲ"  0.36
    "Veja"  0.35
    "設備の"  0.35
    " sahip"  0.35
    " பாஜக"  0.35
    "Você"  0.34
    " Số"  0.34
    Activation Density: 0.001%

    No Known Activations