INDEX
    Explanations

    uncertainty and inquiries about knowledge or understanding

    Negative Logits
    Token             Logit
    "TagMode"         -0.69
    "ſelf"            -0.68
    " gynhyrchwyd"    -0.67
    "featureID"       -0.66
    " queſta"         -0.66
    "SharedDtor"      -0.65
    " ligiloj"        -0.65
    " Houſe"          -0.64
    " surla"          -0.63
    "itinéraire"      -0.63

    Positive Logits
    Token             Logit
    " what"            1.70
    "what"             1.30
    "What"             1.27
    " What"            1.23
    " WHAT"            1.07
    "WHAT"             0.95
    " whats"           0.88
    "whats"            0.83
    " hvad"            0.76
    " hva"             0.75
    Activation Density: 0.047%

    No Known Activations