INDEX
Explanations
references to espionage and spy-related content
Negative Logits
xual            -0.91
esville         -0.80
à¼              -0.74
clus            -0.72
ĸļ              -0.71
Kurd            -0.66
Interstitial    -0.65
Explain         -0.65
kell            -0.64
urther          -0.63
Positive Logits
glass           1.00
oleon           0.93
sonian          0.83
ware            0.83
dropping        0.82
satellites      0.82
OSH             0.79
ionage          0.79
spying          0.77
agency          0.76
Activations Density 0.026%