INDEX
Explanations
references to different layers, both literal and metaphorical
references to different layers in systems or structures
Negative Logits
ドラ         -0.80
Predators    -0.77
STAR         -0.73
date         -0.71
etheus       -0.70
nia          -0.70
WAR          -0.69
speaking     -0.67
lishing      -0.67
FRE          -0.66
Positive Logits
layers       1.44
layer        1.30
layer        1.00
Layer        0.98
Layer        0.96
thickness    0.92
coats        0.87
coat         0.82
opacity      0.81
layered      0.79
Activations Density 0.018%