Fine-Grained Object Detection and Manipulation with Segmentation-Conditioned Perceiver-Actor