INDEX
    Explanations

    phrases containing the special character 'Ċ'
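
The 'Ċ' in this explanation is probably not a literal character in the text. Assuming the underlying model uses a GPT-2 style byte-level BPE vocabulary (an assumption; the dashboard does not name the tokenizer), 'Ċ' is the printable stand-in for the newline byte. The sketch below rebuilds that byte-to-character table; the helper names `bytes_to_unicode` and `decode_token` are chosen here for illustration.

```python
def bytes_to_unicode():
    """Rebuild a GPT-2 style byte -> printable-character table (assumed tokenizer)."""
    # Bytes that already render as visible characters keep their own code point:
    # '!'..'~', U+00A1..U+00AC, and U+00AE..U+00FF.
    printable = (
        list(range(0x21, 0x7F))
        + list(range(0xA1, 0xAD))
        + list(range(0xAE, 0x100))
    )
    mapping = {}
    shifted = 0
    for b in range(256):
        if b in printable:
            mapping[b] = chr(b)
        else:
            # Control bytes, space, etc. are moved past U+00FF in order.
            mapping[b] = chr(256 + shifted)
            shifted += 1
    return mapping


BYTE_TO_CHAR = bytes_to_unicode()
CHAR_TO_BYTE = {c: b for b, c in BYTE_TO_CHAR.items()}


def decode_token(token: str) -> bytes:
    """Map a displayed token string back to the raw bytes it stands for."""
    return bytes(CHAR_TO_BYTE[c] for c in token)


print(repr(BYTE_TO_CHAR[0x0A]))   # stand-in shown for the newline byte
print(decode_token("\u010a"))     # raw byte behind 'Ċ'
```

Under the same assumption, the runs of '³' in the logit lists are raw 0xB3 bytes, which this table leaves unchanged because that byte already has a printable code point.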

    expressions of dissatisfaction or criticism towards various aspects of society or behavior

    Negative Logits
    Token           Logit
    " occas"        -0.95
    " eleph"        -0.88
    " oun"          -0.86
    " exting"       -0.83
    " exha"         -0.83
    " neighb"       -0.82
    " tremend"      -0.78
    "aditional"     -0.78
    " newcom"       -0.77
    " citiz"        -0.75
    Positive Logits
    Token                   Logit
    "Therefore"             0.94
    "³³³³³³³³"              0.91
    "³³³³³³³³³³³³³³³³"      0.87
    "Furthermore"           0.85
    "³³³³"                  0.84
    "Personally"            0.84
    "Likewise"              0.84
    "Learn"                 0.83
                            0.81
    "Sadly"                 0.80
    Act Density 0.718%
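
One plausible reading of this figure, assuming "Act Density" means the fraction of sampled tokens on which the feature has a nonzero activation (the dashboard does not define it), is sketched below. The function name `activation_density`, the threshold, and the sample values are all hypothetical.

```python
def activation_density(activations, threshold=0.0):
    """Fraction of tokens whose activation exceeds `threshold` (assumed definition)."""
    if not activations:
        raise ValueError("need at least one activation value")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: 1000 token positions, 7 of them active.
sample = [0.0] * 993 + [1.2] * 7
print(f"{activation_density(sample):.3%}")
```

On that reading, a density of 0.718% would mean the feature fires on roughly 7 out of every 1000 tokens.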

    No Known Activations