Explanations
proper nouns
references to the term "Ja," which appears multiple times throughout the document
Negative Logits
Debor        -0.82
atories      -0.70
ified        -0.69
Nationwide   -0.66
IFIED        -0.66
âĶģ          -0.66
Revelations  -0.65
ãĤĮ          -0.65
ifiers       -0.65
ä¹ĭ          -0.65
Positive Logits
quet         1.13
ques         1.10
quez         1.04
vel          1.03
ihad         0.89
ignt         0.87
oust         0.87
unta         0.86
¶ħ           0.84
onse         0.84
Activations Density 0.021%
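
A few tokens in the logit lists (âĶģ, ãĤĮ, ä¹ĭ, ¶ħ) look like encoding damage, but they are consistent with how GPT-2-style byte-level BPE tokenizers display raw bytes as printable characters. The page does not say which tokenizer was used, so the sketch below is an assumption: it rebuilds the standard GPT-2 byte-to-character table and inverts it to recover the underlying text. The helper name `decode_token` is hypothetical and exists only for this illustration.

```python
def bytes_to_unicode():
    """Byte -> printable character table used by GPT-2-style byte-level BPE."""
    # Bytes that are already printable map to themselves.
    bs = (list(range(ord("!"), ord("~") + 1))
          + list(range(ord("¡"), ord("¬") + 1))
          + list(range(ord("®"), ord("ÿ") + 1)))
    cs = bs[:]
    n = 0
    # Every remaining byte is shifted to a code point at 256 or above.
    for b in range(256):
        if b not in bs:
            bs.append(b)
            cs.append(256 + n)
            n += 1
    return dict(zip(bs, (chr(c) for c in cs)))


# Inverse table: displayed character -> original byte value.
UNICODE_TO_BYTE = {ch: b for b, ch in bytes_to_unicode().items()}


def decode_token(display):
    """Hypothetical helper: turn a displayed token back into text.

    Assumes the token was rendered with the GPT-2 byte-level table.
    Bytes that do not form valid UTF-8 become U+FFFD.
    """
    raw = bytes(UNICODE_TO_BYTE[ch] for ch in display)
    return raw.decode("utf-8", errors="replace")


if __name__ == "__main__":
    for tok in ["âĶģ", "ãĤĮ", "ä¹ĭ", "¶ħ", "quet"]:
        print(repr(tok), "->", repr(decode_token(tok)))
```

Under that assumption, âĶģ, ãĤĮ and ä¹ĭ decode to ━, れ and 之 respectively. ¶ħ maps to two UTF-8 continuation bytes, so it is a fragment of a multi-byte character and cannot be decoded on its own. Plain ASCII tokens such as quet are unchanged.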