    Negative Logits
    " sehr"               0.63
    " perfetto"           0.61
    " extremadamente"     0.59
    " estremamente"       0.58
    " perfecta"           0.57
    " extrêmement"        0.57
    "非常"                0.55
    " дуже"               0.55
    " bardzo"             0.55
    " perfekt"            0.55
    Positive Logits
    " meaningful"         0.96
    " genuinely"          0.94
    " meaningfully"       0.90
    " genuine"            0.80
    "实际"                0.79
    " reasonably"         0.78
    " 실제로"             0.76
    " actual"             0.76
    " actually"           0.75
    " thoughtfully"       0.73
    Activation Density: 0.045%

    No Known Activations