INDEX
Explanations
the word "Land" with varying degrees of specificity denoted by different activation values
the repetition of the word "Land" in various contexts
Negative Logits
thodox        -0.77
sidx          -0.75
Downloadha    -0.74
ongyang       -0.73
umbn          -0.69
tremend       -0.68
pload         -0.68
aukee         -0.68
fancy         -0.66
vernment      -0.66
Positive Logits
scape         1.10
lords         1.04
Land          1.04
Land          0.96
lord          0.94
Rover         0.90
lander        0.90
Acquisition   0.83
owner         0.83
rait          0.79
Activations Density 0.010%