INDEX
    Explanations

    abbreviations and prefixes

    Negative Logits
    Token                 Value
    ка                    3.68
    лно                   3.59
    ل                     3.55
    l                     3.52
    (token not shown)     3.33
    en                    3.27
    ená                   3.09
    лни                   3.08
    ר                     3.04
    лна                   2.97
    Positive Logits
    Token                 Value
    NOWLEDG               2.65
    !\!\                  2.42
    ainte                 2.35
     armada               2.33
     κι                   2.29
    (token not shown)     2.28
    (token not shown)     2.25
    ','"+                 2.24
    𝔰                     2.21
    𝐫                     2.21
    Activation Density: 0.043%

    No Known Activations