INDEX
Explanations
phrases indicating the importance and relevance of information or requests
Negative Logits
ol       -0.16
olare    -0.16
ech      -0.15
ajo      -0.15
arian    -0.15
ive      -0.15
üz       -0.15
ird      -0.15
nod      -0.15
Votre    -0.14
Positive Logits
tome     0.26
إلي      0.23
unto     0.21
لدي      0.20
μαζί     0.17
velt     0.17
="__     0.17
إليه     0.16
bagi     0.16
aney     0.15
Activations Density 0.432%