This paper presents a data-driven approach for predicting the propagation of traffic congestion at road seg-ments as a function of the congestion in their neighboring segments. In the past, this problem has mostly been addressed by modelling the traffic congestion over some standard physical phenomenon through which it is difficult to capture all the modalities of such a dynamic and complex system. While other recent works have focused on applying a generalized data-driven technique on the whole network at once, they often ignore intersection characteristics. On the contrary, we propose a city-wide ensemble of intersection level connected LSTM models and propose mechanisms for identifying congestion events using the predictions from the networks. To reduce the search space of likely congestion sinks we use the likelihood of congestion propagation in neighboring road segments of a congestion source that we learn from the past historical data. We validated our congestion forecasting framework on the real world traffic data of Nashville, USA and identified the onset of congestion in each of the neighboring segments of any congestion source with an average precision of 0.9269 and an average recall of 0.9118 tested over ten congestion events.