INDEX
Explanations
significant nouns and words related to quantity or classification
Negative Logits
Ferd	-0.18
á»iji	-0.16
ÏĢλα	-0.16
acer	-0.16
uti	-0.15
çª	-0.15
ÙĴÙĩ	-0.14
à¹Ģà¸Ĺ	-0.14
Resident	-0.14
endale	-0.14
Positive Logits
revers	0.16
andır	0.16
atory	0.15
ÙĪØ§ÙĨ	0.15
ahu	0.15
etta	0.14
behold	0.14
ço	0.14
ija	0.14
ãĤĦãģĻ	0.14
Activations Density 0.003%