    Negative Logits
    "пле"           0.44
    "体に"          0.42
    " théorème"     0.41
    [token not shown]  0.41
    "𒌅"            0.41
    "тьяна"         0.39
    "طع"            0.38
    " grassland"    0.38
    " জিজ্ঞাস"      0.38
    " الجديد"       0.37

    Positive Logits
    "initialized"   0.66
    "self"          0.60
    " _"            0.59
    "initialize"    0.57
    "is"            0.57
    " init"         0.56
    " initialized"  0.56
    "init"          0.55
    " current"      0.54
    " Initialize"   0.53
    Act Density 0.060%

    No Known Activations