    Explanations

    the term "first" in various contexts

    Negative Logits
    Token             Logit
    " pleaſure"       -0.89
    " purpoſe"        -0.88
    "RenderAtEndOf"   -0.88
    "hoeddwyd"        -0.81
    "ſelf"            -0.80
    " greateſt"       -0.79
    " ſtate"          -0.78
    " houſe"          -0.77
    " Ninth"          -0.76
    " Sixth"          -0.71

    Positive Logits
    Token             Logit
    "ViewFeatures"    0.67
    "ніципалі"        0.57
    " [*]"            0.56
    "oxone"           0.55
    "stalt"           0.55
    "tokenizer"       0.54
    "cidae"           0.54
    "IBILITIES"       0.52
    " &___"           0.49
    "mobileqq"        0.49

    Activation Density: 0.042%

    No Known Activations