INDEX
    Explanations

    references to regularity and consistency in various contexts

    Negative Logits
    Token       Logit
    quiv        -0.15
    levator     -0.14
    redi        -0.14
    elic        -0.14
    etu         -0.14
    etros       -0.14
    erial       -0.14
    vail        -0.13
    oft         -0.13
    rogen       -0.13
    Positive Logits
    Token       Logit
    ity         0.50
    s           0.39
    ized        0.36
    ities       0.35
    ily         0.34
    ised        0.32
    ITY         0.30
    isation     0.30
    izing       0.30
    ization     0.28
    Activation Density 0.026%

    No Known Activations