INDEX
    Explanations
    NEGATIVE LOGITS
     NavController  -0.59
    tidae           -0.57
    temberg         -0.56
    خاذ             -0.56
    vka             -0.54
    hütte           -0.54
    Kat             -0.52
    entu            -0.52
     BAND           -0.51
    ahuila          -0.51
    POSITIVE LOGITS
     surprise       1.68
     Surprise       1.65
     surprised      1.48
     surprises      1.45
    surprise        1.42
    Surprise        1.40
     overras        1.34
     surpris        1.29
    surprised       1.28
     surpresa       1.28
    Activation Density: 0.064%

    No Known Activations