INDEX
    Negative Logits
    index         1.00
    Index         0.89
    Office        0.86
    introduction  0.84
    Introduction  0.84
    README        0.82
    Overview      0.81
    detail        0.81
    Category      0.81
    office        0.79
    Positive Logits
     이름을  0.97
     부여  0.94
     paranoia  0.94
    َر  0.94
     방식  0.92
     Diffusion  0.90
    (token not shown)  0.89
    (token not shown)  0.89
    sthresh  0.88
    (token not shown)  0.88
    Activation density: 0.042%

    No Known Activations