INDEX
    Explanations

    phrases indicating complete exclusion or avoidance

    the word "altogether" in various contexts

    Negative Logits
    Token         Logit
     Patriarch    -0.71
     Cascade      -0.68
    hao           -0.65
    nan           -0.63
    yers          -0.62
     Fulton       -0.62
    yer           -0.62
     Apache       -0.61
     Yar          -0.59
     Minute       -0.59

    Positive Logits
    Token         Logit
    loo           0.79
     disarm       0.74
    ADRA          0.68
    olkien        0.65
    ICA           0.65
    gebra         0.65
    EngineDebug   0.65
    æ³            0.64
    Iterator      0.64
    Decre         0.64

    Act Density 0.010%

    No Known Activations