INDEX
    Explanations
    Negative Logits
    Token            Logit
    " racer"         -0.07
    " cosine"        -0.07
    " liquid"        -0.06
    "zones"          -0.06
    " gratuit"       -0.06
    "15"             -0.06
    "Ubergraph"      -0.06
    " Ari"           -0.06
    "654"            -0.06
    (unrecovered)    -0.06
    Positive Logits
    Token            Logit
    " ["             0.13
    "["              0.10
    "=["             0.10
    "_["             0.09
    ":["             0.08
    "}["             0.08
    " [↵"            0.08
    " [\"            0.07
    (unrecovered)    0.07
    "//["            0.07
    Activation Density: 0.157%

    No Known Activations