INDEX
    Explanations

    quantities and comparisons

    Negative Logits
    Token          Logit
    "eventType"    0.41
    "vartheta"     0.40
    " жа"          0.39
    " संबोध"        0.38
    " deviate"     0.38
    "altungen"     0.38
    "мога"         0.37
    " સંદેશ"        0.37
    " જા"          0.37
    " narration"   0.37
    Positive Logits
    Token          Logit
    "一副"          0.40
                   0.39
    "itute"        0.38
    "unoscut"      0.38
    " çant"        0.38
                   0.37
    "一身"          0.37
    " scarf"       0.37
    "otch"         0.37
    "Fechar"       0.37
    Act Density 0.001%

    No Known Activations