Explanations
phrases related to tampering or manipulation
instances of the word "tampering" and related forms in different contexts
Negative Logits
ĺħ      -0.77
falls   -0.73
hens    -0.70
Primal  -0.70
RG      -0.66
sson    -0.64
ppo     -0.63
pu      -0.63
Adren   -0.61
Route   -0.60
Positive Logits
tampering  1.21
tink       1.03
tam        0.95
iddled     0.88
aven       0.74
inker      0.73
redd       0.70
ritten     0.70
ocalypse   0.69
arks       0.69
Activations Density 0.020%