Explanations
proper nouns, specifically the name "Derek"
mentions of the name "Derek" in various contexts
Negative Logits
Token      Logit
loop       -0.81
overflow   -0.80
bus        -0.75
vill       -0.71
NAT        -0.70
Rat        -0.70
Gent       -0.68
Witch      -0.68
hop        -0.67
loops      -0.66
Positive Logits
Token      Logit
Derek      3.34
erek       2.20
Darren     1.08
Clyde      1.07
Raiders    1.06
Kelvin     1.03
Draper     1.01
Raider     0.99
Khal       0.99
Manziel    0.97
Activations Density: 0.030%