INDEX
Explanations
References to specific times (specific dates, past times), locations, and actions (such as searching or taking other actions)
Neuron Alignment
Index    Value    % of L₁
939      +0.09    0.3%
2019     +0.09    0.3%
856      +0.08    0.2%
Correlated Neurons
Index    P. Corr.    Cos Sim.
939      +0.09       0.06
1967     +0.09       0.04
1567     +0.08       0.03
Negative Logits
kado      -0.71
kram      -0.69
uhr       -0.67
palab     -0.66
horri     -0.65
jaja      -0.64
gero      -0.64
ingrat    -0.64
mercen    -0.63
meis      -0.62
Positive Logits
amenities          0.47
europa             0.46
dort               0.44
InSection          0.44
딪                 0.42
withIdentifier     0.42
そこで             0.42
SharedCtor         0.41
IntoConstraints    0.41
dzies              0.41
Activations Density: 0.576%