INDEX
Explanations
nothing, as it did not activate on any tokens in the examined text
Negative Logits
content   -0.49
<eos>     -0.46
}{|       -0.45
YLE       -0.41
oran      -0.40
gere      -0.39
unci      -0.38
sau       -0.38
Cox       -0.38
rec       -0.37
Positive Logits
Personensuche     1.25
RegressionTest    1.19
bewerken          1.16
MigrationBuilder  1.15
beginnetje        1.09
<bos>             1.09
tagHelperRunner   1.02
betweenstory      1.02
Majefty           1.00
Савезне           0.99
Activations Density 0.096%