Modern very-large-scale integration (VLSI) designs usually consist of modules with diverse topological structures and functionalities. To better optimize such large and heterogeneous logic networks, it is essential to identify the structural and functional characteristics of their modules and to represent each module with an appropriate directed acyclic graph (DAG) type, such as AIG, MIG, or XAG, for logic optimization. This paper proposes HeLO, a hetero-DAG logic optimization framework empowered by hierarchical clustering and graph learning. HeLO uses a hierarchical clustering algorithm that splits the original Boolean network into sub-circuits by considering both topological and functional characteristics. A customized graph neural network model generates the topological-functional embedding (used for distance calculation in hierarchical clustering) and predicts the best-fit DAG type for each sub-circuit. Experimental results demonstrate that HeLO outperforms LSOracle, the state-of-the-art heterogeneous logic optimization framework, by 8.7% in node-depth product (for technology-independent logic optimization) and by 6.9% in delay-area product (for technology mapping).
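
For readers unfamiliar with the node-depth product named above, the following minimal sketch shows one way it could be computed for a small logic network and how a relative improvement between two results might be expressed. It is an illustration only, not HeLO's or LSOracle's implementation. The function names `node_depth_product` and `relative_improvement`, the dictionary-based network encoding, and the two toy networks are all assumptions introduced here. The sketch also assumes that primary inputs are excluded from the node count and that the improvement is measured as a reduction relative to the baseline.

```python
from functools import lru_cache


def node_depth_product(fanins):
    """Return (number of gates) * (logic depth) for a DAG.

    `fanins` is a hypothetical encoding: it maps each node name to the
    list of its fanin node names. Primary inputs map to an empty list
    and are not counted as gates (an assumption of this sketch).
    """

    @lru_cache(maxsize=None)
    def level(node):
        # Primary inputs sit at level 0; a gate is one level above
        # its deepest fanin. Recursion is fine for small examples.
        ins = fanins[node]
        return 0 if not ins else 1 + max(level(i) for i in ins)

    gates = [n for n, ins in fanins.items() if ins]
    depth = max((level(n) for n in gates), default=0)
    return len(gates) * depth


def relative_improvement(baseline, ours):
    """Fractional reduction of a cost metric relative to a baseline."""
    return (baseline - ours) / baseline


# Two toy realizations of a 4-input AND (hypothetical data).
# Chain: ((a & b) & c) & d -> 3 gates, depth 3.
chain = {"a": [], "b": [], "c": [], "d": [],
         "g1": ["a", "b"], "g2": ["g1", "c"], "f": ["g2", "d"]}
# Balanced: (a & b) & (c & d) -> 3 gates, depth 2.
balanced = {"a": [], "b": [], "c": [], "d": [],
            "g1": ["a", "b"], "g2": ["c", "d"], "f": ["g1", "g2"]}

ndp_chain = node_depth_product(chain)        # 3 * 3 = 9
ndp_balanced = node_depth_product(balanced)  # 3 * 2 = 6
print(ndp_chain, ndp_balanced, relative_improvement(ndp_chain, ndp_balanced))
```

The delay-area product used after technology mapping can be treated the same way, with the mapped delay and cell area taking the place of depth and node count.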