INDEX
    Explanations

    references to numerical values and their representations

    Negative Logits
     فريبيس          -1.21
    tvguidetime      -1.09
     myſelf          -1.05
     itſelf          -1.01
     pleaſure        -1.00
     للمعارف         -1.00
     purpoſe         -0.97
    AutoScaleMode    -0.95
     Chriftian       -0.94
     ujednoznacz     -0.93
    Positive Logits
     (                    0.50
    [token not shown]     0.49
     au                   0.48
     Ch                   0.46
    (                     0.45
    [token not shown]     0.44
     h                    0.44
    final                 0.44
    Au                    0.43
     final                0.43
    Activation Density: 0.141%

    No Known Activations