Explanations
references to specific animals or animal-related terms
camel, giraffe, alpaca, ass, mule
Negative Logits
icitis        -0.39
exitRule      -0.38
iyaki         -0.37
discre        -0.37
disclosed     -0.36
iParam        -0.36
:✨            -0.36
disclosures   -0.35
Alder         -0.35
chande        -0.35
Positive Logits
camel         0.99
camel         0.93
ostrich       0.89
🐫            0.85
Camel         0.82
Camel         0.81
camels        0.79
🐪            0.78
hump          0.73
驼            0.69
Activations Density 0.023%