Explanations
words starting with "Se" followed by a high activation number
occurrences of the character sequence "Se"
Negative Logits
Token        Logit
Downloadha   -0.76
aneers       -0.74
wire         -0.73
dayName      -0.69
hoop         -0.66
realism      -0.66
CPC          -0.65
overhead     -0.65
crib         -0.63
deed         -0.63
Positive Logits
Token        Logit
venth        1.15
vere         1.12
quel         1.12
eker         1.09
asons        1.04
ems          1.03
lected       1.03
Se           1.00
ek           0.99
eking        0.99
Activations Density 0.006%
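
As a rough illustration of the density figure, the sketch below assumes that "activations density" means the fraction of token positions on which the feature's activation is above zero; the page itself does not define the term. The function name `activation_density` and the sample data are hypothetical and chosen only to reproduce a value of 0.006%.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of positions whose activation exceeds `threshold`.

    Assumption: "density" is the share of tokens on which the feature fires.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical data: the feature fires on 6 of 100,000 token positions.
sample = [0.0] * 99_994 + [1.2, 0.4, 3.1, 0.9, 2.2, 0.7]
density = activation_density(sample)
print(f"{density:.3%}")
```

Under this reading, a density of 0.006% corresponds to the feature firing on roughly 6 tokens per 100,000, which fits a feature tied to a narrow pattern such as the "Se" prefix.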