Explanations
comparison of levels or degrees
Negative Logits
Mow         0.47
Hathaway    0.42
invers      0.42
更是        0.41
다섯        0.40
duas        0.40
Mā          0.39
inversion   0.39
🫣          0.39
여섯        0.38
Positive Logits
lesser      1.05
mindre      0.90
weaker      0.88
smaller     0.88
Lesser      0.85
Smaller     0.85
Secondary   0.81
Smaller     0.80
smaller     0.80
secondary   0.77
Activations Density 0.497%