INDEX
    Explanations

    proper nouns related to brands, titles, or notable individuals

    Negative Logits
    Datuak              -0.48
    WriteLiteral        -0.42
    <eos>               -0.37
     cara               -0.35
    ────                -0.34
    ...,                -0.34
    RuleContext         -0.34
    subsection          -0.34
     gynhyrchwyd        -0.34
     salu               -0.33
    Positive Logits
     zijne              0.59
     kasarigan          0.58
     dezelve            0.54
                        0.53
     zoude              0.53
     tambi              0.53
    MessageTagHelper    0.52
     nemlig             0.50
    hæng                0.50
     zelve              0.50
    Activation Density: 0.020%

    No Known Activations