Explanations
English language and related concepts
Negative Logits
'  0.79
are  0.64
anos  0.64
ara  0.63
’  0.61
ia  0.57
UR  0.56
ارين  0.54
azienda  0.54
elle  0.52
Positive Logits
อังกฤษ  1.01
English  0.91
English  0.91
Анг  0.84
ENGLISH  0.82
Englisch  0.80
영어  0.78
Englishman  0.78
inglés  0.77
ENGLISH  0.76
Activations Density: 0.023%