INDEX
Explanations
text related to official and confidential documents
occurrences of the word "documents."
Negative Logits
irth        -0.77
oker        -0.71
olen        -0.70
olls        -0.70
skill       -0.69
athetic     -0.68
anism       -0.68
gettable    -0.67
eful        -0.67
bid         -0.65
Positive Logits
documents    1.26
document     1.04
Documents    1.00
Documents    0.96
papers       0.91
files        0.86
docs         0.82
TeX          0.82
Papers       0.80
Document     0.79
Activations Density 0.010%
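
To make the density figure concrete, here is a minimal sketch. It assumes that "activation density" means the fraction of token positions on which the feature fires at all (activation above zero). The page itself does not state that definition, so treat it as an assumption. The function name `activation_density` and the sample data are hypothetical, chosen only for illustration. Under that reading, 0.010% corresponds to roughly one active token in every 10,000.

```python
def activation_density(activations, threshold=0.0):
    """Fraction of token positions whose activation exceeds `threshold`.

    Assumed definition: the source page does not spell out how its
    density figure is computed.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical data: 10,000 token positions, exactly one of which activates.
sample = [0.0] * 9_999 + [3.2]
density = activation_density(sample)
print(f"{density:.3%}")  # prints "0.010%"
```

A density this low would be consistent with a sparse feature that fires only on a narrow set of tokens, such as the "documents" family listed under Positive Logits.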