    Explanations

    technical specifications and concepts

    Negative Logits
    Arab        0.39
    engk        0.38
    libr        0.38
    ミン        0.37
    ámica       0.37
    Conv        0.36
    timevals    0.36
                0.36
    min         0.36
    Scal        0.36
    Positive Logits
     rade       0.43
     DF         0.42
     fetish     0.41
    ドン        0.41
     piers      0.41
     miracle    0.39
     Miracle    0.39
    доо         0.38
     hero       0.38
     kuj        0.38
    Act Density 0.001%

    No Known Activations