INDEX
Explanations
specific nouns and concepts
Negative Logits
ק          0.47
yogurt     0.46
curd       0.46
shrimp     0.45
نہیں۔      0.43
dict       0.42
BOOL       0.42
copying    0.42
meringue   0.41
triggered  0.41
Positive Logits
anın       0.47
arów       0.46
ostęp      0.45
vän        0.43
igheder    0.42
ઓની        0.42
있다는     0.41
中的       0.41
записи     0.41
amot       0.41
Activations Density 0.000%