Explanations
mentions of the word "shack."
occurrences of the word "shack" and related terms
Negative Logits
ppo       -0.74
osate     -0.72
assic     -0.71
cker      -0.69
endars    -0.66
ague      -0.66
ppa       -0.65
icter     -0.65
teness    -0.64
ocious    -0.63
Positive Logits
eering    1.03
lain      0.94
yrinth    0.93
nai       0.92
eers      0.91
nesses    0.88
mone      0.87
intosh    0.85
ledge     0.84
neys      0.84
Activations Density 0.042%