    Explanations

    phrases indicating strength or certainty

    Negative Logits
    Token         Logit
    "ly"          -0.17
    "/layouts"    -0.16
    "ettle"       -0.15
    "ooks"        -0.15
    "errupt"      -0.14
    "olut"        -0.14
    "наче"        -0.14
    "аліз"        -0.14
    "нев"         -0.14
    "rij"         -0.14
    Positive Logits
    Token         Logit
    " indeed"     0.56
    " fact"       0.48
    " actually"   0.43
    " inf"        0.40
    " Indeed"     0.39
    "Indeed"      0.37
    "inde"        0.37
    "fact"        0.35
    " totiž"      0.35
    "actually"    0.35
    Act Density 0.049%

    No Known Activations
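
The "Act Density" figure above can be read with a small sketch. It assumes that density means the percentage of sampled tokens on which the feature's activation is positive. The page itself does not define the term, so this is an assumption, and the function name `activation_density` and the sample data are hypothetical.

```python
def activation_density(activations):
    """Return the percentage of tokens with a positive activation.

    Assumption: "density" is the share of tokens on which the feature
    fires at all; the source page does not state its definition.
    """
    if not activations:
        raise ValueError("need at least one token activation")
    active = sum(1 for a in activations if a > 0)
    return 100.0 * active / len(activations)


# Hypothetical sample: the feature fires on 49 of 100,000 tokens.
sample = [1.0] * 49 + [0.0] * (100_000 - 49)
density = activation_density(sample)
print(f"{density:.3f}%")
```

Under this reading, a density of 0.049% corresponds to the feature firing on roughly 49 of every 100,000 tokens, which fits a sparse feature tied to a narrow set of words such as "indeed" and "actually".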