INDEX
Explanations
phrases related to secrecy or hidden information
concepts related to secrecy and confidentiality
Negative Logits
Token         Logit
ergy          -0.82
bal           -0.78
balance       -0.77
ermanent      -0.70
di            -0.70
arte          -0.69
imus          -0.68
opez          -0.68
Interstitial  -0.68
anza          -0.68
Positive Logits
Token         Logit
recy          1.14
secrecy       1.01
shrouded      0.92
secrets       0.90
shroud        0.89
cloaked       0.84
rets          0.83
guarded       0.78
surrounding   0.75
secret        0.71
Activations Density: 0.026%