INDEX
Explanations
phrases with positive tones related to reviews
Neuron Alignment
Index   Value   % of L₁
1413    +0.10   0.3%
1865    +0.10   0.3%
549     +0.09   0.3%
Correlated Neurons
Index   P. Corr.   Cos Sim.
1413    +0.10      0.03
2037    +0.10      0.03
549     +0.09      0.03
Negative Logits
reluct      -1.13
Hez         -1.13
Bartholo    -1.10
encomp      -1.05
Timp        -1.04
Varan       -1.03
depic       -1.02
Vaugh       -1.02
Juf         -1.01
wherea      -1.01
Positive Logits
find               0.80
FIND               0.78
finding            0.73
find               0.73
finds              0.73
FIND               0.71
finding            0.63
disambiguazione    0.63
Find               0.63
found              0.62
Activations Density 0.139%