Explanations
terms related to universality or collective experiences
Negative Logits
Token      Logit
//         -0.60
Bauer      -0.57
هاند       -0.56
ässä       -0.55
/          -0.55
zt         -0.54
йом        -0.53
ktır       -0.53
DaoImpl    -0.52
Cone       -0.52
Positive Logits
Token         Logit
every         1.64
every         1.61
EVERY         1.60
EVERY         1.59
Every         1.53
Every         1.46
Ogni          1.21
Everywhere    1.11
Jede          1.07
Jedes         1.07
Activations Density: 0.049%