INDEX
Explanations
references to leaked documents and negotiations in political contexts
Negative Logits
Token     Logit
ATAB      -0.18
atab      -0.16
unas      -0.15
ensex     -0.15
exerc     -0.15
esson     -0.14
fitte     -0.14
kli       -0.14
ilk       -0.14
iffs      -0.14
Positive Logits
Token         Logit
admissions    0.23
admission     0.23
revealed      0.22
admitted      0.21
candid        0.20
Admission     0.20
records       0.19
documents     0.19
revealing     0.18
reveals       0.18
Activations Density: 0.225%