Explanations
mentions of rat poison or rats in various contexts
Neuron Alignment
Index   Value   % of L₁
1271    +0.16   0.7%
663     +0.16   0.7%
1520    +0.14   0.6%
Correlated Neurons
Index   P. Corr.   Cos Sim.
663     +0.16      0.02
1520    +0.16      0.02
437     +0.14      0.01
Negative Logits
Token            Logit
jabi             -0.51
dillera          -0.48
FFEE             -0.46
كومونز           -0.45
khu              -0.43
SM               -0.43
Saba             -0.42
inghouse         -0.42
refour           -0.41
tartalomajánló   -0.41
Positive Logits
Token   Logit
Rat     1.39
rat     1.36
Rat     1.30
Rats    1.25
Rats    1.20
rats    1.19
rat     1.09
RAT     1.03
RAT     0.95
rats    0.87
Activations Density: 0.076%