INDEX
Explanations
Gemma, llama, LaMDA, LGA
Tokens that are named entities or proper nouns (product or model names, people, places, and other capitalized terms).
Negative Logits
l       1.92
k       1.57
t       1.55
n       1.55
r       1.42
lari    1.41
j       1.40
ल       1.37
یم      1.29
il      1.23
Positive Logits
า       1.64
2       1.51
σε      1.45
া       1.32
та      1.30
في      1.30
ாதி     1.30
ا       1.27
ة       1.24
্শন     1.22
Activations Density 0.769%