INDEX
    Explanations
    Negative Logits
    Token          Logit
    "341"          -0.08
    " Scrap"       -0.08
    " charts"      -0.08
    " theft"       -0.08
    "963"          -0.07
    " ANT"         -0.07
    "\tFile"       -0.07
    " costumes"    -0.07
    " Archivo"     -0.07
    "ف"            -0.07
    Positive Logits
    Token          Logit
    "Sentence"     0.12
    "sentence"     0.11
    " Sentence"    0.11
    " sentence"    0.11
    " sentences"   0.11
    "一句"          0.10
    "句话"          0.10
    "_sentence"    0.10
    " четыр"       0.09
    " count"       0.09
    Act Density 0.016%

    No Known Activations