Explanations
references to refugee situations and humanitarian crises
Neuron Alignment
Index   Value   % of L₁
156     +0.32   2.0%
186     +0.31   2.0%
56      +0.17   1.0%
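
The "% of L₁" column reads most naturally as each entry's absolute value divided by the L₁ norm (sum of absolute values) of the whole weight vector. The dashboard does not state this definition, so the sketch below is an assumption; `l1_shares` and the toy vector are hypothetical and not taken from the data above.

```python
def l1_shares(weights):
    """Return each entry's share of the vector's L1 norm, in percent.

    Assumed definition: 100 * |w_i| / sum_j |w_j|.
    """
    l1_norm = sum(abs(w) for w in weights)
    if l1_norm == 0:
        raise ValueError("a zero vector has no L1 norm to take shares of")
    return [100.0 * abs(w) / l1_norm for w in weights]


# Hypothetical toy vector, chosen so the arithmetic is easy to follow.
toy_weights = [0.5, -0.25, 0.25]
shares = l1_shares(toy_weights)
```

Under this reading, a value of +0.32 at 2.0% would imply an L₁ norm of roughly 16 for the full vector, which is also consistent with the other two rows.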
Correlated Neurons
Index   P. Corr.   Cos Sim.
186     +0.32      0.12
56      +0.31      -0.00
210     +0.17      0.03
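
The two columns can disagree, as in the row for index 56, because the measures differ: Pearson correlation subtracts each vector's mean before comparing, while cosine similarity does not. Which vectors the dashboard compares in each column is not stated here, so the sketch below only illustrates the two textbook definitions on hypothetical toy data.

```python
import math


def cosine_similarity(x, y):
    """Cosine of the angle between x and y (no mean-centering)."""
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    return dot / (norm_x * norm_y)


def pearson_correlation(x, y):
    """Pearson correlation: cosine similarity of the mean-centered vectors."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    centered_x = [a - mean_x for a in x]
    centered_y = [b - mean_y for b in y]
    return cosine_similarity(centered_x, centered_y)


# Hypothetical toy data: y is x shifted by a constant offset.
x = [1.0, 2.0, 3.0]
y = [3.0, 4.0, 5.0]

corr = pearson_correlation(x, y)  # offset is removed by centering
cos = cosine_similarity(x, y)     # offset changes the angle
```

Here the correlation is essentially 1 because the offset vanishes after centering, while the cosine similarity stays below 1.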
Negative Logits
Token            Logit
↵                -4.97
č↵               -4.97
↵                -4.97
↵                -4.97
(blank)          -4.97
↵↵               -4.97
↵                -4.97
(blank)          -4.97
(blank)          -4.97
<|outofrange|>   -4.97
Positive Logits
Token        Logit
fleeing      2.58
injured      2.07
flee         1.98
displaced    1.86
unemployed   1.83
emig         1.81
occupying    1.79
stric        1.78
illegally    1.77
seized       1.77
Activations Density: 2.265%
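
Activation density is commonly taken to mean the fraction of sampled tokens on which the feature fires at all. The dashboard does not give its exact definition or threshold, so the sketch below assumes "activation strictly above zero"; `activation_density` and the sample values are hypothetical.

```python
def activation_density(activations, threshold=0.0):
    """Fraction of tokens whose activation exceeds `threshold`.

    Assumed definition; the dashboard's own threshold is not stated.
    """
    if not activations:
        raise ValueError("need at least one activation value")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical per-token activations for one feature.
sample = [0.0, 0.0, 1.3, 0.0, 0.0, 0.0, 0.7, 0.0]
density = activation_density(sample)
print(f"{density:.3%}")
```

On this reading, a density of 2.265% would mean the feature is active on roughly one sampled token in 44.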