    Explanations

    specific structural elements and formatting in text

    Negative Logits
    تقاوى               -0.64
     cherchés          -0.56
    haikusbot          -0.55
     senhora           -0.54
    SharedDtor         -0.53
     betweenstory      -0.51
     téléphonique      -0.50
     Infórmanos        -0.50
    ArgsConstructor    -0.49
                       -0.49
    Positive Logits
    ↵↵↵                0.81
    ↵↵↵↵↵              0.69
    ↵↵↵↵               0.67
    ↵↵↵↵↵↵↵            0.65
    ↵↵↵↵↵↵↵↵↵          0.60
    ↵↵↵↵↵↵             0.60
    ↵↵↵↵↵↵↵↵           0.59
    ↵↵↵↵↵↵↵↵↵↵↵        0.58
     Gros              0.53
    uz                 0.52
    Activation Density: 0.011%

    No Known Activations