Explanations
descriptions of a person's activities and belongings in a specific setting
Neuron Alignment
Index   Value   % of L₁
906     +0.12   0.4%
1150    +0.10   0.3%
1705    +0.09   0.3%
Correlated Neurons
Index   P. Corr.   Cos Sim.
946     +0.12      0.05
906     +0.10      -0.00
736     +0.09      0.06
Negative Logits
unspeak     -1.81
disagre     -1.78
shenan      -1.76
affor       -1.74
hairc       -1.73
apprehen    -1.73
snoopy      -1.72
impra       -1.68
tolerably   -1.65
strick      -1.65
Positive Logits
<bos>            1.13
SourceChecksum   0.86
autorytatywna    0.86
חיצוניים         0.83
lenker           0.80
Mientras         0.79
حوالہ            0.78
Aholisi          0.77
Sklici           0.77
smithy           0.73
Activations Density: 0.484%