INDEX
Explanations
logical and physical distinctions
Negative Logits
Token       Logit
meister     -0.84
Gigabyte    -0.77
jogo        -0.74
Dragon      -0.72
лись        -0.71
Movies      -0.71
ENABLED     -0.71
reacción    -0.70
Zivil       -0.69
箋          -0.69
Positive Logits
Token       Logit
entities    1.00
⌢           0.95
qualities   0.95
論          0.93
physical    0.93
quantities  0.90
Physical    0.89
Physical    0.87
istically   0.86
logically   0.85
Activations Density: 0.025%