INDEX
Explanations
spy-related words
terms related to spying or surveillance
Negative Logits
pite          -0.77
hirt          -0.72
rawn          -0.67
inished       -0.66
ension        -0.65
urdue         -0.63
condensed     -0.63
separation    -0.62
sob           -0.61
ribbon        -0.60
Positive Logits
hawk          0.87
onom          0.78
finder        0.78
Aware         0.77
Detect        0.76
ostics        0.75
rors          0.74
seeker        0.73
Watch         0.72
spies         0.71
Activations Density: 0.071%