INDEX
Explanations
assertions of "good news" or positive outcomes in various contexts
NEGATIVE LOGITS
Token        Logit
gnore        -0.41
Tembelea     -0.38
klaar        -0.36
Superhosts   -0.36
ronpa        -0.36
EndTag       -0.35
fjspx        -0.35
formik       -0.35
Ante         -0.35
Infra        -0.35
POSITIVE LOGITS
Token        Logit
news         3.67
news         2.94
News         2.81
News         2.72
NEWS         2.67
NEWS         2.45
noticias     2.42
berita       2.31
noticia      2.30
nieuws       2.23
Activations Density: 0.678%