    Negative Logits
    Token          Logit
    " Baton"       -0.07
    " жиз"         -0.07
    " cycle"       -0.07
    " Treatment"   -0.07
    " Tabs"        -0.06
    " Mind"        -0.06
    " Ibn"         -0.06
    " minors"      -0.06
    " ngữ"         -0.06
    " 말씀"        -0.06
    Positive Logits
    Token          Logit
    "-based"       0.08
    "-Based"       0.08
    "]]></"        0.08
    "based"        0.07
    " Reads"       0.07
    " країни"      0.07
    " 실제"        0.07
    " издел"       0.06
    " právě"       0.06
    "cheap"        0.06
    Activation Density: 0.013%

    No Known Activations