    Explanations

    phrases that indicate time or location-related references

    Negative Logits
    Token        Logit
    "vert"       -0.16
    "ейн"        -0.14
    "imit"       -0.14
    "opal"       -0.14
    "zd"         -0.14
    "istor"      -0.14
    "aura"       -0.14
    "582"        -0.14
    "üz"         -0.14
    " Eug"       -0.14

    Positive Logits
    Token        Logit
    " scale"     0.19
    " levels"    0.18
    "elic"       0.17
    "every"      0.15
    " grass"     0.15
    " stake"     0.15
    " Scale"     0.15
    "levels"     0.15
    " scales"    0.15
    " every"     0.15
    Act Density 0.071%

    No Known Activations