INDEX
Explanations
target-related words or phrases
instances of the word "target" and its variations
Negative Logits
loo                        -0.78
maid                       -0.73
iHUD                       -0.70
fo                         -0.70
SourceFile                 -0.70
lycer                      -0.68
cia                        -0.66
Expedition                 -0.64
ublic                      -0.64
BuyableInstoreAndOnline    -0.62
Positive Logits
ted                        1.37
ting                       0.99
ched                       0.83
audience                   0.75
squarely                   0.72
eers                       0.71
oided                      0.70
ivated                     0.69
targets                    0.67
chers                      0.66
Activations Density: 0.046%