INDEX
    Explanations

    emphasized differs, way to

    Negative Logits
    0.65  ک
    0.47
    0.45
    0.45   Continuation
    0.44   Percent
    0.44  特徴
    0.44  基本
    0.43   Copenhagen
    0.43
    0.43  希望
    Positive Logits
    0.45  IDEA
    0.44   الشيء
    0.43   cuadr
    0.43  NGTH
    0.43  A
    0.43  Ţ
    0.42  takes
    0.42   adored
    0.42   बचाता
    0.41   የሆነ
    Activation Density: 0.001%

    No Known Activations