Explanations
references to animals and creatures
Negative Logits
Token          Logit
Reps           -0.76
Effective      -0.73
independents   -0.72
Policies       -0.71
)=(            -0.70
Administ       -0.68
Publishers     -0.68
ettings        -0.68
uries          -0.67
Consumers      -0.67
Positive Logits
Token          Logit
carc           1.14
frog           1.11
fish           1.08
tailed         1.03
mascot         1.02
beetle         1.01
pup            0.98
species        0.95
ishly          0.92
frog           0.91
Activations Density: 0.269%