The development of European power markets is strongly influenced by integrated electricity and heat systems. Therefore, decarbonization policies for the electricity and heat sectors, as well as the numerical models used to guide such policies, should consider cross-sectoral interdependencies. However, although many model-based policy assessments exist for the highly interconnected European electricity system, international studies that consider interactions with the heat sector are rare. In this contribution, we systematically study the potential benefits of integrated heat and power systems by conducting a model comparison experiment. Five large-scale market models covering electricity and heat supply were used to study the interactions between a rather simple coal replacement scenario and a more ambitious policy that supports decarbonization through power-to-heat. The comparison focuses on flexibility provision, emissions reduction, and economic efficiency. Although the models agree on the qualitative effects, there are considerable quantitative differences. For example, the estimated reductions in overall CO2 emissions range between 0.2 and 9.0 MtCO2/a for the coal replacement scenario and between 0.2 and 25.0 MtCO2/a for the power-to-heat scenario. Model differences can be attributed mainly to the level of detail of combined heat and power (CHP) modeling and the endogeneity of generation investments. Based on a detailed comparison of the modeling results, we discuss implications for modeling choices and political decisions.