INDEX
Explanations
instances of language or terminology
references to language used in legal or political contexts
Negative Logits
CLASSIFIED    -0.77
iary          -0.74
cot           -0.74
rients        -0.68
iaries        -0.66
TAIN          -0.66
opia          -0.65
Maurit        -0.63
umeric        -0.63
rique         -0.63
Positive Logits
mith          0.87
barriers      0.83
witz          0.82
anguage       0.79
spoken        0.78
eloqu         0.75
describing    0.72
surrounding   0.71
constructs    0.71
language      0.71
Activations Density: 0.080%