    Negative Logits
    " stabile"        -0.09
    " философ"        -0.08
    " Philosoph"      -0.08
    "rochen"          -0.08
    " humaine"        -0.08
    " mankind"        -0.07
    " Greeks"         -0.07
    " philosoph"      -0.07
    "ơi"              -0.07
    "wania"           -0.07
    Positive Logits
    " Classroom"      0.10
    "Rnd"             0.08
    " leaderboard"    0.08
    "Leaderboard"     0.08
    " classroom"      0.08
    "Contest"         0.08
    "Lecture"         0.08
    "Scratch"         0.08
    "_customize"      0.08
    "CRT"             0.08
    Activation Density: 0.003%

    No Known Activations