Explanations
strength, weakness, and power
Negative Logits
cosh        0.87
গুলি        0.83
Space       0.82
zor         0.81
spazi       0.78
োন্ন        0.77
espaces     0.76
eeu         0.75
িজ          0.75
espacios    0.75
Positive Logits
hold        1.14
holds       1.10
িশালী       1.09
mẽ          1.01
💪          0.99
weak        0.91
Weak        0.91
weak        0.90
winds       0.89
strong      0.87
Activations Density 0.499%