Explanations
repetitive phrases indicating frequency or universality
Negative Logits
Token    Logit
//       -0.67
Bauer    -0.65
zt       -0.64
Cone     -0.60
/        -0.59
йом      -0.58
Azur     -0.58
dados    -0.58
Shane    -0.57
lze      -0.56
Positive Logits
Token    Logit
every    1.97
every    1.94
EVERY    1.88
Every    1.83
EVERY    1.81
Every    1.81
Ogni     1.43
Jedes    1.31
Chaque   1.24
Ogni     1.24
Activations Density: 0.117%