INDEX
Explanations
apology, ultimatum, extend, alibi, excuse
Negative Logits
Announced       -0.85
Enlight         -0.85
Announcements   -0.84
anap            -0.83
Noice           -0.75
peny            -0.75
anonym          -0.75
Doo             -0.73
kirim           -0.73
ወ               -0.73
Positive Logits
an              1.77
Ul              1.47
ultimatum       1.30
ulti            1.18
Ul              1.16
ulti            1.09
olive           0.98
uli             0.90
apology         0.88
HandleFunc      0.87
Activations Density: 0.052%