INDEX
Explanations
phrases indicating foundational or underlying principles for various topics
Negative Logits
Token          Logit
displacement   -0.72
Scu            -0.72
Ginn           -0.71
tighets        -0.69
Displacement   -0.69
ി              -0.68
displacement   -0.68
gug            -0.67
textView       -0.66
Scu            -0.66
Positive Logits
Token          Logit
Based          1.28
based          1.23
Based          1.22
BASED          1.22
based          1.19
BASED          1.16
basé           1.12
bases          1.00
basée          0.99
Basing         0.97
Activations Density: 0.076%