In this paper, we address the question of what minimal cognitive features are necessary for learning to process and extract grammatical structure from language. We build a minimal computational model containing only two core features, chunking and sequence memory, and test its capacity to identify sentence borders and parse sentences in two artificial languages. The model has no prior linguistic knowledge and learns only through reinforcement when it identifies meaningful units. In simulations, the model succeeds at both tasks, indicating that it is a good starting point for an extended model with the ability to process and extract grammatical structure from larger corpora of natural language. We conclude that a model with chunking and sequence memory, which should in the future be complemented with the ability to establish hierarchical schemas, has the potential to describe the emergence of grammatical categories through language learning.