INDEX
Explanations
occurrences of the word "It" in various contexts
Negative Logits
görmek      -0.15
ÂĮ          -0.14
ãĥ¼ãĥ©      -0.14
.sul        -0.14
è·          -0.14
heaps       -0.14
одав        -0.13
Tokenizer   -0.13
ä»¶         -0.13
lossen      -0.13
Positive Logits
alia        0.20
alc         0.18
ching       0.18
ald         0.17
seems       0.16
alis        0.15
alo         0.15
olo         0.15
semb        0.14
official    0.14
Activations Density 0.095%