INDEX
Explanations
phrases that express a sense of urgency or inevitability in various situations
Negative Logits
or              -0.52
(               -0.48
Сере            -0.47
me              -0.46
(blank)         -0.45
zin             -0.44
representing    -0.44
mit             -0.43
const           -0.43
[               -0.43
Positive Logits
Tudo            1.16
المعيارى        1.09
Tudo            1.06
tudo            0.99
transférez      0.98
betweenstory    0.96
^(@)            0.95
kasarigan       0.93
Everything      0.91
abestanden      0.90
Activations Density 0.146%