Explanations
expressions of support or opposition towards certain entities or causes, including political figures and legislative actions
Neuron Alignment
Index  Value  % of L₁
573    +0.12  0.4%
347    +0.11  0.3%
50     +0.10  0.3%
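The "% of L₁" column is not defined on the page. One plausible reading, offered here only as an assumption, is that it gives each neuron's absolute value as a share of the L₁ norm of the feature's full weight vector. A minimal sketch under that assumption follows; the function name `l1_share` and the toy vector are hypothetical and do not come from the page.

```python
import numpy as np

def l1_share(weights, top_k=3):
    """Return (index, value, share of L1 norm) for the top_k largest-magnitude entries.

    Assumption: "% of L1" means |w_i| / sum_j |w_j|. The page does not confirm this.
    """
    weights = np.asarray(weights, dtype=float)
    l1 = np.abs(weights).sum()              # L1 norm of the whole vector
    order = np.argsort(-np.abs(weights))    # indices sorted by descending magnitude
    return [
        (int(i), float(weights[i]), float(abs(weights[i]) / l1))
        for i in order[:top_k]
    ]

# Toy vector (hypothetical data, not taken from the page).
rows = l1_share([0.5, -1.5, 2.0], top_k=3)
for index, value, share in rows:
    print(f"{index:<6} {value:+.2f}  {share:.1%}")
```

Under this reading, a value of +0.12 that accounts for 0.4% of the L₁ norm would imply a total L₁ norm of roughly 30, which is consistent with the feature being spread thinly over many neurons.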
Correlated Neurons
Index  P. Corr.  Cos Sim.
573    +0.12     0.04
347    +0.11     0.03
92     +0.10     0.04
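The two columns above measure different things, assuming "P. Corr." stands for Pearson correlation and "Cos Sim." for cosine similarity. Pearson correlation is the cosine similarity of the two vectors after each has been mean-centred, so the two numbers can diverge when the vectors have non-zero means. The sketch below illustrates the two formulas on toy vectors; which vectors the page actually compares (activations over tokens, weight directions, or something else) is not stated, and the function names are hypothetical.

```python
import numpy as np

def cosine_sim(x, y):
    """Cosine of the angle between x and y."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    return float(x @ y / (np.linalg.norm(x) * np.linalg.norm(y)))

def pearson_corr(x, y):
    """Pearson correlation: cosine similarity of the mean-centred vectors."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    return cosine_sim(x - x.mean(), y - y.mean())

# Toy vectors (hypothetical data): perfectly anti-correlated,
# yet still pointing in broadly the same direction.
a = [1.0, 2.0, 3.0]
b = [3.0, 2.0, 1.0]
print(f"Pearson: {pearson_corr(a, b):+.2f}")
print(f"Cosine:  {cosine_sim(a, b):+.2f}")
```

Neither function guards against constant or all-zero inputs, which would divide by zero; a real pipeline would need to handle that case.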
Negative Logits
<bos>            -0.69
mybatisplus      -0.67
underland        -0.64
principalColumn  -0.62
ensibility       -0.59
calipsis         -0.59
mxArray          -0.59
énégal           -0.57
protoc           -0.57
DisplayMetrics   -0.56
Positive Logits
depic      1.57
shenan     1.51
encomp     1.47
fortn      1.37
increa     1.37
strick     1.36
unspeak    1.33
attemp     1.33
intersper  1.33
affor      1.32
Activations Density: 0.191%