Explanations
the concept of universality across different contexts
universal concepts
Negative Logits
Meksi         -0.50
躇            -0.46
borderSide    -0.42
Turquía       -0.41
Venedig       -0.41
Irán          -0.40
Moscú         -0.40
Fotografía    -0.39
venezolano    -0.38
tuples        -0.38
Positive Logits
universal       2.17
universal       2.02
Universal       1.85
Universal       1.80
UNIVERSAL       1.75
universale      1.61
universality    1.52
universel       1.51
universally     1.48
universelle     1.42
Activations Density: 0.006%