INDEX
Explanations
references to surveillance and espionage activities
references to surveillance and spying activities
Negative Logits
gran      -0.75
creation  -0.67
usable    -0.67
shr       -0.66
joy       -0.66
Jer       -0.65
à¤        -0.65
Course    -0.65
nos       -0.65
ergy      -0.65
Positive Logits
spying     1.11
ank        0.92
eaves      0.86
sonian     0.85
wiret      0.83
spoof      0.76
afia       0.75
ãĥ¼ãĥĨãĤ£  0.74
spies      0.73
ionage     0.72
Activations Density 0.010%