INDEX
    Explanations

    phrases indicating increased difficulty or challenges

    phrases that indicate increasing difficulty or obstacles

    Negative Logits
    EVA         -0.74
    ivo         -0.71
    inate       -0.70
    Kings       -0.69
    reen        -0.69
    abet        -0.68
    Saint       -0.67
     Charity    -0.67
    Liber       -0.66
     Gazette    -0.64
    Positive Logits
     than       1.22
     Than       1.01
     harder     0.85
     behaved    0.82
     compr      0.76
     forgiving  0.76
     resil      0.74
    nces        0.74
     "$:/       0.74
    ModLoader   0.73
    Activation Density: 0.012%

    No Known Activations