    Explanations

    classifying with descriptions and attributes

    Negative Logits
    (token not rendered)    0.40
    цію    0.39
    (token not rendered)    0.38
    (token not rendered)    0.38
    ։    0.38
    ®.    0.37
    .”    0.37
    ंग    0.37
    .`    0.37
    孩子的    0.37
    Positive Logits
     certes    0.36
    ={},    0.36
    \%,    0.35
    captivity    0.35
    '=>'','    0.34
    ",[],    0.34
     zwar    0.33
    (token not rendered)    0.31
     {},    0.31
    =?,    0.31
    Activation Density 0.179%

    No Known Activations