INDEX
Explanations
mentions of drunkenness or related behavior in police reports
Neuron Alignment
Index   Value   % of L₁
47      +0.08   0.3%
920     +0.07   0.2%
650     +0.07   0.2%
Correlated Neurons
Index   P. Corr.   Cos Sim.
12      +0.08      0.03
47      +0.07      0.03
1437    +0.07      0.03
Negative Logits
<bos>           -1.07
లాలు            -0.76
/**             -0.66
//              -0.66
public          -0.65
class           -0.65
بتاريخ          -0.64
/**             -0.63
/*              -0.63
HasAnnotation   -0.62
Positive Logits
drunk           2.08
affor           1.94
impra           1.93
Drunk           1.87
increa          1.86
drunk           1.82
stockholm       1.80
drunken         1.80
sappi           1.77
aen             1.75
Activations Density 0.127%