INDEX
Explanations
words related to spying or espionage
Negative Logits
Token       Logit
rament      -0.63
Transcript  -0.62
ptive       -0.61
po          -0.60
endar       -0.59
hett        -0.59
sequ        -0.55
cial        -0.55
claw        -0.55
andr        -0.54
Positive Logits
Token       Logit
disguise    1.15
sorts       0.77
Īè          0.69
heaven      0.69
nature      0.69
nomine      0.68
lishes      0.67
canvas      0.66
retty       0.65
steroids    0.65
Activations Density 0.366%