INDEX
    Explanations

    code aggregation

    Negative Logits
    " pensées"        -0.08
    " froide"         -0.08
    " fría"           -0.08
    " risques"        -0.08
    " deadly"         -0.08
    " tamamen"        -0.08
    " menace"         -0.07
    " assurances"     -0.07
    "cracker"         -0.07
    " abụọ"           -0.07
    Positive Logits
    " AVG"            0.11
    "AVG"             0.10
    " aggregated"     0.09
    "Avg"             0.09
    " Durchschnitt"   0.09
    " average"        0.08
    "平均"            0.08
    ".Aggreg"         0.08
    " agreg"          0.08
    " avg"            0.08
    Activation Density: 0.004%

    No Known Activations