Explanations
the word "this" followed by other words
references to the document or post being discussed
Negative Logits
Token       Logit
Ĭ±          -0.88
mates       -0.76
zees        -0.75
ع           -0.71
akuya       -0.71
planes      -0.70
Nazis       -0.70
Americans   -0.69
iors        -0.68
ا           -0.67
Positive Logits
Token       Logit
article     1.59
blog        1.51
tutorial    1.38
essay       1.31
guide       1.30
FAQ         1.26
post        1.26
section     1.25
wiki        1.21
page        1.17
Activations Density: 0.176%