    Explanations

    research and technical texts

    Negative Logits
    Token                      Logit
    "homme"                    -0.07
    (token not shown)          -0.07
    "emploi"                   -0.06
    "ENDOR"                    -0.06
    " Uncategorized"           -0.06
    " accredited"              -0.06
    " навіть"                  -0.06
    "/open"                    -0.06
    "\admin"                   -0.06
    "ausal"                    -0.06
    Positive Logits
    Token                      Logit
    "_con"                     0.07
    " scho"                    0.06
    "]</"                      0.06
    "她们"                     0.06
    " konk"                    0.06
    " sporting"                0.06
    " Possible"                0.06
    (whitespace-only token)    0.06
    " flirting"                0.06
    (token not shown)          0.06
    Activation Density: 0.291%

    No Known Activations