Associative Transformer is a Sparse Representation Learner