INDEX
Explanations
mentions of specific trademarks or brands related to "Instinct."
terms related to legal designations or classifications
Negative Logits
published    -0.67
Timber       -0.66
Wat          -0.66
Nets         -0.64
Dear         -0.62
targets      -0.62
blackmail    -0.61
flies        -0.61
Goodman      -0.61
Wa           -0.61
Positive Logits
inct         4.84
inction      1.92
ingu         1.48
inctions     1.48
inguished    1.32
inguishable  1.23
antly        1.05
iple         0.98
ract         0.97
icent        0.97
Activations Density 0.009%