INDEX
    Explanations

    expressions of significant emotions or impactful experiences

    Text following sentence endings

    Negative Logits
     насељу
    -0.95
    MigrationBuilder
    -0.92
    kháu
    -0.86
     '\\;'
    -0.86
     Efq
    -0.85
    Portail
    -0.85
    клопе
    -0.85
    #+#
    -0.84
     Obrador
    -0.84
     neceff
    -0.83
    Positive Logits
    </blockquote>
    0.78
    [toxicity=0]
    0.71
      
    0.59
     All
    0.59
    0.58
    </td>
    0.57
     <
    0.57
    <
    0.57
    0.57
    ↵↵
    0.57
    Act Density 0.748%

    No Known Activations