INDEX
Explanations
mentions of tampering or manipulation
references to tampering and related activities
Negative Logits
anamo                    -0.80
ilipp                    -0.72
BuyableInstoreAndOnline  -0.71
zl                       -0.69
Sigma                    -0.68
sam                      -0.68
RIC                      -0.65
AIN                      -0.65
hens                     -0.65
auri                     -0.64
Positive Logits
tampering                0.99
tam                      0.94
tink                     0.80
ritten                   0.79
etry                     0.77
rodu                     0.75
bley                     0.72
aker                     0.68
havoc                    0.68
ington                   0.67
Activations Density 0.041%