INDEX
Explanations
phrases related to education policies and systemic issues in a school setting
Neuron Alignment
Index    Value    % of L₁
604      +0.11    0.3%
1726     +0.08    0.2%
872      +0.07    0.2%
Correlated Neurons
Index    P. Corr.    Cos Sim.
1383     +0.11       0.04
1951     +0.08       0.04
1477     +0.07       0.05
Negative Logits
disagre     -1.46
reluct      -1.44
maneu       -1.35
depic       -1.33
uninten     -1.33
ftu         -1.31
inev        -1.29
apprehen    -1.29
fta         -1.27
resear      -1.27
Positive Logits
root          0.75
problem       0.71
root          0.69
underlying    0.67
problems      0.66
blame         0.62
Root          0.62
issue         0.61
issues        0.61
ROOT          0.61
Activations Density: 0.511%