INDEX
    Explanations

    the presence of numerical values, particularly in the context of structured or formatted data

    "become" or "instruction"

    Negative Logits
      
    -0.99
       
    -0.79
    ↵↵
    -0.68
    -0.64
    ↵↵↵
    -0.62
     ,
    -0.61
     con
    -0.59
        
    -0.58
    -0.57
         
    -0.57
    Positive Logits
    1.09
     nakalista
    1.06
    ſelves
    0.96
    Vidite
    0.93
    0.93
    DockStyle
    0.92
    tanleria
    0.92
     iſt
    0.91
     ſind
    0.91
     $_(
    0.90
    Activation Density: 0.166%

    No Known Activations