In the field of pharmacochemistry, it is a time-consuming and expensive process for the new drug development. The existing drug design methods face a significant challenge in terms of generation efficiency and quality.


In this paper, we proposed a novel molecular generation strategy and optimization based on A2C reinforcement learning. In molecular generation strategy, we adopted transformer-DNN to retain the scaffolds advantages, while accounting for the generated molecules’ similarity and internal diversity by dynamic parameter adjustment, further improving the overall quality of molecule generation. In molecular optimization, we introduced heterogeneous parallel supercomputing for large-scale molecular docking based on message passing interface communication technology to rapidly obtain bioactive information, thereby enhancing the efficiency of drug design. Experiments show that our model can generate high-quality molecules with multi-objective properties at a high generation efficiency, with effectiveness and novelty close to 100%. Moreover, we used our method to assist shandong university school of pharmacy to find several candidate drugs molecules of anti-PEDV.

Availability and implementation

The datasets involved in this method and the source code are freely available to academic users at

This is an Open Access article distributed under the terms of the Creative Commons Attribution License (, which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.