INDEX
    Explanations

    periods and other punctuation marks

    period followed by capitalization

    Negative Logits
    .")
    -0.72
    .",
    -0.69
    $
    -0.68
    >",
    -0.68
    )");
    -0.65
     "");
    -0.65
    ...");
    -0.65
    !")
    -0.65
    "):
    -0.64
    '}>
    -0.63
    Positive Logits
    Ligações
    0.42
    วม
    0.42
    ↵↵↵
    0.42
     Inscrivez
    0.40
    ↵↵↵↵
    0.40
    lichem
    0.39
    ordning
    0.39
    END
    0.39
    Hence
    0.39
     gratuits
    0.39
    Activation Density: 0.024%

    No Known Activations