Explanations
References to the word "Even", suggesting the neuron detects instances of that word.
Negative Logits
脚注の使い方    -0.87
IntoConstraints    -0.87
kaarangay    -0.85
وتسجيلات    -0.85
XmlAccessType    -0.79
linkovi    -0.79
ViewFeatures    -0.79
enumi    -0.77
bezeichneter    -0.77
Савезне    -0.75
Positive Logits
Even    1.25
Even    1.10
EVEN    0.63
EVEN    0.59
Même    0.57
Даже    0.57
Даже    0.53
teen    0.52
смотря    0.49
hộp    0.49
Activations Density 0.006%