INDEX
    Explanations

    sentences ending with "said"

    special characters or symbols in the text

    Negative Logits
        Token           Logit
        " decomp"       -0.85
        " Marble"       -0.76
        " Myster"       -0.75
        " Discord"      -0.73
        " Nept"         -0.72
        " Voyager"      -0.71
        " Manhattan"    -0.70
        " clutter"      -0.70
        " gray"         -0.69
        " warp"         -0.68
    Positive Logits
        Token           Logit
        "¬"             0.97
        "¹"             0.97
        "£"             0.96
        "į"             0.95
        "Asia"          0.92
        "ais"           0.92
        "Į"             0.90
        "AFP"           0.90
        "Iraq"          0.87
        "ech"           0.87
    Activation Density: 0.427%

    No Known Activations