    Explanations

    language, nationality, recipes

    Negative Logits
        Token            Value
        " Disclosure"    0.36
        " Preferred"     0.34
        " Ghosts"        0.34
        " κάπο"          0.33
        " magari"        0.33
        " Landmarks"     0.32
        " Universal"     0.32
        " intranet"      0.32
        " Cred"          0.32
        " Drivers"       0.32
    Positive Logits
        Token                    Value
        "フランス"               0.42
        " francese"              0.42
        [token text missing]     0.37
        "contenido"              0.36
        " फ्रांस"                 0.35
        " பெ"                    0.35
        " পাকিস্তানের"            0.35
        "レシ"                   0.35
        "BI"                     0.35
        " mexicano"              0.35
    Activation Density: 0.148%

    No Known Activations