    Explanations

    This neuron detects occurrences of the word “fun.”

    Negative Logits
    Token        Logit
    " очі"       -0.07
    " зрост"     -0.07
    "loyd"       -0.07
    "serir"      -0.07
    "zie"        -0.07
    "BILL"       -0.07
    "woord"      -0.07
    " heavy"     -0.07
    "CTR"        -0.06
    " overd"     -0.06
    Positive Logits
    Token        Logit
    " fun"        0.17
    " Fun"        0.16
    "Fun"         0.11
    "fun"         0.10
    " FUN"        0.09
    "FUN"         0.09
    " grátis"     0.08
    "\tfun"       0.07
    " Fut"        0.07
                  0.07
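
    The page does not say how the logit lists above are produced. A common convention, assumed here rather than confirmed by the source, is to take the dot product of the neuron's output direction with each token's unembedding vector and report the largest and smallest values. The sketch below illustrates that idea on toy data; the function names, the 2-dimensional vectors, and the numbers are all hypothetical and are not taken from the model behind this page.

```python
def logit_effects(direction, unembed):
    """Dot the (assumed) neuron output direction with each token's unembedding vector."""
    return {
        token: sum(d * u for d, u in zip(direction, vector))
        for token, vector in unembed.items()
    }


def top_and_bottom(effects, k):
    """Return the k most positive and the k most negative (token, value) pairs."""
    ranked = sorted(effects.items(), key=lambda item: item[1], reverse=True)
    positive = ranked[:k]
    negative = ranked[-k:][::-1]  # most negative first
    return positive, negative


# Toy, made-up values purely for illustration.
direction = [1.0, 0.0]
unembed = {
    " fun": [2.0, 0.5],
    "Fun": [1.0, -0.2],
    " heavy": [-1.0, 0.5],
    "CTR": [-0.5, 0.1],
}

effects = logit_effects(direction, unembed)
positive, negative = top_and_bottom(effects, k=2)
print(positive)
print(negative)
```

    Note that leading whitespace is part of the token string, which is why " fun" and "fun" appear as separate rows.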
    Act Density 0.017%
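
    The page does not define "Act Density". One plausible reading, stated here as an assumption, is the fraction of sampled tokens on which the neuron's activation exceeds zero. A minimal sketch under that assumption follows; `activation_density` and the toy activation list are hypothetical, with the list sized only so that the result matches the figure reported above.

```python
def activation_density(activations, threshold=0.0):
    """Fraction of token positions whose activation exceeds the threshold (assumed definition)."""
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Toy data: 100,000 token positions, 17 of which fire.
acts = [0.0] * 100_000
for i in range(17):
    acts[i * 5_000] = 1.0

density = activation_density(acts)
print(f"{density:.3%}")
```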

    No Known Activations