Explanations
Spelling errors or multiple languages
The neuron spikes on tokens containing Polish letters with diacritics (ą, ć, ę, ł, ń, ś, ó, ż, ź).
The neuron activates on words that contain specific Polish letters.
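The pattern these explanations describe can be sketched as a simple token check. This is only an illustration of the described behaviour, not the neuron's actual computation: the helper name `has_polish_diacritic` and the sample tokens are hypothetical, and the letter set is taken directly from the explanation above.

```python
import unicodedata

# Letters listed in the explanation above (lowercase forms only).
POLISH_DIACRITICS = set("ąćęłńśóżź")


def has_polish_diacritic(token: str) -> bool:
    """Return True if the token contains any of the listed Polish letters.

    The token is normalized to NFC first so that decomposed forms
    (base letter + combining mark) are matched as well.
    """
    normalized = unicodedata.normalize("NFC", token).lower()
    return any(ch in POLISH_DIACRITICS for ch in normalized)


# Hypothetical sample tokens, for illustration only.
tokens = ["żółw", "tape", "łódź", "asylum"]
flagged = [t for t in tokens if has_polish_diacritic(t)]
```

A check like this only approximates the explanation text. Note that ó also occurs in other languages, so a pure character match would flag some non-Polish tokens too.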
Negative Logits
tapes  -0.07
ры  -0.07
tape  -0.06
contributes  -0.06
environments  -0.06
-------------------------------------------------------------------------  -0.06
masking  -0.06
accidentally  -0.06
prosecution  -0.06
Presentation  -0.06
Positive Logits
exampleModalLabel  0.07
.imag  0.07
BaseService  0.06
asylum  0.06
━━━━━━━━  0.06
額  0.06
vigor  0.06
.Com  0.06
모습  0.06
Malik  0.06
Activations Density 0.015%