Explanations
words related to leaked or unauthorized information/documents
references to leaked information or documents
Negative Logits
rior      -0.71
Sax       -0.68
Atkins    -0.61
Ages      -0.61
joy       -0.59
.}        -0.59
orest     -0.59
rest      -0.59
ians      -0.59
orient    -0.58
Positive Logits
leaked        3.85
leaks         2.56
leak          2.39
leaking       2.15
Leaks         1.66
leakage       1.56
spilled       1.55
Wikileaks     1.49
circulated    1.45
hacked        1.43
Activations Density: 0.012%
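
The source does not define how the density figure is computed. A minimal sketch, assuming it denotes the fraction of sampled tokens on which the feature has a nonzero activation; the function name, the threshold parameter, and the sample data are all hypothetical and chosen only for illustration:

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of tokens whose activation exceeds `threshold`.

    Assumption: "density" means the share of tokens on which the feature fires.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: the feature fires on 3 of 25,000 tokens.
sample = [0.0] * 24_997 + [1.2, 0.4, 3.1]
density = activation_density(sample)
print(f"{density:.3%}")  # prints 0.012%
```

Under this reading, a density of 0.012% would mean the feature fires on roughly 12 of every 100,000 tokens, which fits a feature tied to a narrow vocabulary such as the "leak" tokens listed above.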