    Negative Logits
     које  0.86
     ovat  0.80
     jiné  0.78
     ktoré  0.78
     koje  0.78
     которое  0.78
     které  0.76
     ஆகியவை  0.76
     vhodné  0.75
     ellas  0.74
    Positive Logits
     wasn  1.45
     (not shown)  1.42
     '  1.36
     hasn  1.30
     hadn  1.23
     himself  1.21
     was  1.18
     knows  1.14
     knew  1.13
     didn  1.10
    Activation Density: 0.148%

    No Known Activations