INDEX
Explanations
mentions of the term "Pirate Bay"
terms and references related to piracy and piracy culture
Negative Logits
Token     Logit
nyder     -0.85
uate      -0.84
mble      -0.83
eger      -0.80
Beir      -0.79
uates     -0.72
itably    -0.72
mberg     -0.72
uated     -0.71
ellar     -0.71
Positive Logits
Token     Logit
Pir       1.06
pirates   0.93
cean      0.80
Pirate    0.79
piracy    0.78
pirate    0.76
Pirates   0.76
squid     0.74
ship      0.73
Luffy     0.73
Activations Density: 0.036%