INDEX
    Explanations

    references to author names and their associated works in academic contexts

    Negative Logits
     Vikipedi
    -0.84
     Reſ
    -0.81
     greateſt
    -0.78
     pleaſure
    -0.77
    VersionUID
    -0.77
     الدولى
    -0.76
     ModelExpression
    -0.76
     kasarigan
    -0.76
     كومونز
    -0.76
     fidé
    -0.75
    Positive Logits
    ,
    0.63
    .
    0.62
    0.60
    0.57
    ↵↵
    0.57
    ",
    0.57
     ${
    0.56
     a
    0.56
     "
    0.55
    ).
    0.54
    Act Density 0.168%

    No Known Activations