INDEX
Explanations
mentions of trips or travel
Neuron Alignment
Index   Value   % of L₁
1325    +0.15   0.5%
78      +0.14   0.5%
1870    +0.12   0.4%
Correlated Neurons
Index   Pearson Corr.   Cosine Sim.
78      +0.15           0.02
1325    +0.14           0.02
694     +0.12           0.02
Negative Logits
Token      Logit
parem      -0.47
zara       -0.47
sena       -0.46
ordina     -0.44
ilidades   -0.43
Selamat    -0.42
zima       -0.42
impon      -0.42
tus        -0.41
Spes       -0.41
Positive Logits
Token   Logit
trip    1.24
trip    1.11
Trip    1.06
trips   1.05
Trips   1.05
Trip    1.02
trips   0.97
TRIP    0.91
TRIP    0.89
Trips   0.89
Activations Density: 0.064%