INDEX
    Explanations

    detecting differences or variety

    NEGATIVE LOGITS
     область
    0.51
     technique
    0.44
    ენა
    0.43
     Fläche
    0.42
     χώρα
    0.42
    																								
    0.42
     elementy
    0.41
     кнопки
    0.41
    тип
    0.41
    ότητα
    0.40
    POSITIVE LOGITS
     environments
    1.39
     setups
    1.38
     contexts
    1.31
     climates
    1.29
     regimes
    1.27
     histories
    1.27
     formats
    1.25
     universes
    1.23
     styles
    1.20
     atmospheres
    1.20
    Act Density 0.539%

    No Known Activations