INDEX
    Explanations

    listing and explanations that follow

    NEGATIVE LOGITS
    " vragen"  0.42
    " अनुमान"  0.40
    " спраши"  0.39
    " мои"  0.39
    " رؤ"  0.38
    " otáz"  0.38
    "告诉我"  0.38
    " longueur"  0.38
    " அவ்வ"  0.38
    "假设"  0.38
    POSITIVE LOGITS
    " includes"  0.66
    " Includes"  0.65
    " along"  0.63
    " including"  0.59
    "includes"  0.59
    "包括"  0.57
    "Includes"  0.56
    "including"  0.54
    " inclu"  0.54
    " включа"  0.54
    Act Density 0.003%

    No Known Activations