INDEX
Explanations
instances of the word "hug"
references to hugging
Negative Logits
iasis      -0.72
piracy     -0.67
piracy     -0.65
Punk       -0.65
DoS        -0.62
ourses     -0.61
protected  -0.60
dial       -0.59
secondary  -0.58
Pir        -0.58
Positive Logits
glers      1.03
goodbye    1.03
gers       1.02
eness      1.02
Hug        0.98
hug        0.96
hugs       0.96
aneers     0.94
wrap       0.90
eson       0.88
Activations Density: 0.013%