INDEX
Explanations
instances where the word "walls" is mentioned in the text
references to walls and their conditions or features
Negative Logits
inal       -0.65
onement    -0.65
atorium    -0.64
avez       -0.63
ya         -0.63
arin       -0.63
lich       -0.63
capital    -0.62
INA        -0.62
sb         -0.62
Positive Logits
walls      1.20
mith       1.08
creen      1.05
Walls      0.98
papers     0.86
matter     0.82
paper      0.81
wall       0.78
cape       0.77
wallpaper  0.76
Activations Density 0.032%