INDEX
    NEGATIVE LOGITS
    Token                Logit
    " созда"             1.05
    "naires"             0.95
    "ный"                0.86
    "cence"              0.84
    "the"                0.82
    " наи"               0.82
    "ىڭ"                 0.76
    " постро"            0.75
    " consegu"           0.74
    "<start_of_turn>"    0.73
    POSITIVE LOGITS
    Token                Logit
    " cheese"            1.38
    " Cheese"            1.29
    "Cheese"             1.19
    "-"                  1.17
    " cheeses"           1.12
    " I"                 1.11
    "cheese"             1.10
    "ل"                  1.09
    "ur"                 1.08
    "チーズ"             1.08
    Activation Density: 0.006%

    No Known Activations