INDEX
    Explanations

    punctuation related to dialogue or quotations

    Negative Logits
    ":              -1.33
    "…              -1.30
    ”:              -1.28
    )”.             -1.26
                    -1.26
    ".              -1.26
    .,"             -1.26
    "               -1.25
    ”…              -1.24
    ”).             -1.24

    Positive Logits
    .               0.55
     there          0.53
    uidado          0.53
     no             0.52
     and            0.52
     Her            0.51
     around         0.50
     with           0.48
     culturelles    0.48
     here           0.48
    Act Density 0.129%

    No Known Activations