INDEX
Explanations
mentions of languages and translations
Negative Logits
Ireland      -0.18
anford       -0.18
australia    -0.15
Schneider    -0.15
ì¼           -0.15
Ire          -0.15
Hell         -0.14
ظ            -0.14
缣           -0.14
Compile      -0.14
Positive Logits
Hebrew       0.35
Spanish      0.35
Arabic       0.34
Portuguese   0.33
Mandarin     0.32
French       0.31
Russian      0.29
Spanish      0.29
Hindi        0.29
Gujar        0.28
Activations Density 0.193%