INDEX
Explanations
mentions of specific individuals' names and terms associated with microwaves and decomposition
Negative Logits
WAYS          -0.74
Clockwork     -0.72
tips          -0.70
tenance       -0.68
Colonial      -0.66
Interstitial  -0.65
cycle         -0.65
Gemini        -0.64
Magikarp      -0.64
tarian        -0.63
Positive Logits
yre           1.30
osh           1.17
eneg          0.98
ott           0.96
ire           0.95
iet           0.94
orm           0.92
athy          0.92
olly          0.92
ermott        0.92
Activations Density 0.005%