While interacting with their environment, animals form memories of state-action-outcome sequences that enable them to predict future states and make informed decisions. How this capability emerges in neural populations remains an open question. We investigated this issue by training vanilla recurrent neural networks (RNNs) to navigate randomly generated graphs and assessing whether a simple next-state prediction objective is sufficient to induce generalizable transition representations. The RNNs explored graphs by randomly choosing actions that triggered movement between adjacent nodes, while learning to predict the next node from current node-action pairs. Critically, we created training and test sets with non-overlapping (node, action) transitions, forcing the network to generalize rather than memorize. Despite never seeing test transitions during training, the RNN achieved near-optimal performance on small graphs. Analysis of hidden states revealed that the network spontaneously formed transition-specific representations: the same (node, action) pair encountered across different exploration sequences mapped to similar hidden states, producing distinct clusters in representational space. These results suggest that a next-state prediction objective may support generalizable representations subserving world models in biological neural networks.
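The task setup described above can be sketched in a few lines of Python. This is a hypothetical reconstruction, not the authors' code: the graph representation (one outgoing edge per action), the function names (`make_graph`, `random_walk`, `split_transitions`), and the 20% held-out fraction are all illustrative assumptions. The sketch shows the two key ingredients: random-action exploration producing (node, action, next-node) sequences, and a train/test split over (node, action) pairs that is disjoint by construction.

```python
import random

def make_graph(n_nodes, n_actions, seed=0):
    # Hypothetical random-graph generator: each node gets one outgoing
    # edge per action, with the destination chosen uniformly at random.
    rng = random.Random(seed)
    return {(v, a): rng.randrange(n_nodes)
            for v in range(n_nodes) for a in range(n_actions)}

def random_walk(graph, n_actions, length, start=0, seed=0):
    # Explore by choosing actions uniformly at random, yielding the
    # (node, action, next_node) triples the RNN would be trained on.
    rng = random.Random(seed)
    v, seq = start, []
    for _ in range(length):
        a = rng.randrange(n_actions)
        nxt = graph[(v, a)]
        seq.append((v, a, nxt))
        v = nxt
    return seq

def split_transitions(graph, test_frac=0.2, seed=0):
    # Hold out a disjoint subset of (node, action) pairs so that test
    # transitions never appear during training, forcing generalization
    # rather than memorization (the paper's critical manipulation).
    rng = random.Random(seed)
    pairs = sorted(graph)
    rng.shuffle(pairs)
    k = int(len(pairs) * test_frac)
    return set(pairs[k:]), set(pairs[:k])  # (train, test)

graph = make_graph(n_nodes=6, n_actions=3)
train_pairs, test_pairs = split_transitions(graph)
walk = random_walk(graph, n_actions=3, length=50)
# During training, prediction loss would only be applied at steps whose
# (node, action) pair falls in train_pairs; test accuracy is measured
# on steps whose pair falls in test_pairs.
train_steps = [t for t in walk if (t[0], t[1]) in train_pairs]
test_steps = [t for t in walk if (t[0], t[1]) in test_pairs]
```

An RNN trained on `train_steps` that predicts `next_node` correctly for `test_steps` must have composed knowledge of the graph from overlapping exploration sequences, since those exact transitions were never supervised.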