Explanations
records of scores or numbers
Negative Logits
sign        -1.20
signing     -1.15
Sign        -1.11
Sign        -1.11
only        -0.99
chose       -0.92
signed      -0.91
サイン      -0.86
Only        -0.86
签          -0.85
Positive Logits
monstruos    1.05
             1.00
superhuman   0.98
nudo         0.98
artements    0.96
brechen      0.94
voul         0.94
hoge         0.93
üstü         0.93
REME         0.92
Activations Density: 0.044%