Explanations
Author initials and last names
Negative Logits
๎    0.43
вто    0.43
ус    0.41
uch    0.40
ine    0.40
ling    0.40
oppel    0.39
volatile    0.36
cox    0.36
льзя    0.36
Positive Logits
Bollywood    0.52
ﺍ    0.50
𝖺    0.50
Tutti    0.48
𝚊    0.47
détails    0.46
الم    0.45
piernas    0.45
המ    0.45
realises    0.44
Activations Density: 0.002%