INDEX
Explanations
phrases related to legal cases and proceedings
Neuron Alignment
Index    Value    % of L₁
50       +0.17    0.5%
752      +0.12    0.4%
260      +0.11    0.3%
Correlated Neurons
Index    P. Corr.    Cos Sim.
752      +0.17       0.07
16       +0.12       0.08
260      +0.11       0.05
Negative Logits
.            -0.56
;            -0.48
,            -0.47
Jeografia    -0.46
parution     -0.45
opsida       -0.44
hoort        -0.43
literals     -0.43
:            -0.42
دانشنامهٔ    -0.41
Positive Logits
purtroppo       0.75
ibiza           0.74
sembrano        0.73
squa            0.72
vorrei          0.70
tew             0.70
specialmente    0.70
quitt           0.69
waer            0.69
lidl            0.69
Activations Density 0.492%
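
The density figure is usually read as the fraction of sampled token positions on which the feature fires. The following is a minimal sketch of that calculation, assuming a feature counts as active whenever its activation exceeds zero. The function name `activation_density`, the `threshold` parameter, and the sample data are hypothetical and are not taken from the dashboard.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of positions whose activation exceeds `threshold`.

    Assumption: "active" means strictly greater than the threshold
    (zero by default); the dashboard's exact rule is not stated.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: 1000 token positions, 5 of which are active.
sample = [0.0] * 995 + [0.3, 1.2, 0.7, 2.1, 0.05]
density = activation_density(sample)
print(f"{density:.3%}")  # prints 0.500%
```

On this reading, a density of 0.492% would mean the feature is active on roughly 1 in 200 sampled tokens.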