LLM Connection Graphs for Global Feature Extraction in Point Cloud Analysis

Zeyu Wang; Yue Zhu; Minghao Chen; Minghao Liu; Weijian Qin

doi:10.5281/zenodo.13318518

Authors

Zeyu Wang University of California, Los Angeles, United States
Yue Zhu Georgia Institute of Technology, United States
Minghao Chen top2top Technology Co. Ltd, China
Minghao Liu Arizona State University, United States
Weijian Qin Weill Cornell Medicine, NY, United States

DOI:

https://doi.org/10.5281/zenodo.13318518

Keywords:

graphs, cloud analysis, benchmarks

Abstract

Graph convolutional networks (GCNs) have effectively utilized local connections for point cloud analysis. However, capturing distant dependencies (i.e., global features) with a single local connection graph, such as the Euclidean k-nearest neighbor graph, remains challenging. To address this, we introduce the Multi-Space Graph Convolutional Network (PointGCNN), which leverages reinforcement learning to adaptively construct connection graphs in multiple latent spaces, integrating both local and non-local dependencies. Initially, we encode and concatenate low-level local features from Euclidean and Eigenvalue spaces. Convolution layers are then hierarchically built, with each layer forming dynamic connection graphs to guide the propagation of low-level features [1,2,3,4,11,14,16]. These implicitly constructed graphs enable our model to uncover hidden dependencies. The assorted connections from different graphs support the extraction of fine-grained features from various perspectives, enhancing complex scene recognition. Thus, our model can capture multiple global contexts beyond the local scope of a single space, providing strong robustness against perturbations. Experimental results demonstrate that the proposed method achieves state-of-the-art performance on two major public point cloud benchmarks.
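
The contrast between a single Euclidean k-nearest neighbor graph and graphs built in more than one space can be illustrated with a small sketch. It is not the authors' method: the abstract describes reinforcement-learning-based graph construction, while this example uses plain k-nearest-neighbor search in a hand-made feature space. The function name `knn_graph`, the toy coordinates, and the `latent` feature values are all assumptions made for illustration.

```python
import math


def knn_graph(points, k):
    """Return, for each point, the indices of its k nearest neighbors (self excluded)."""
    neighbors = []
    for i, p in enumerate(points):
        # Rank every other point by Euclidean distance to p; ties break by index.
        order = sorted(
            (j for j in range(len(points)) if j != i),
            key=lambda j: (math.dist(p, points[j]), j),
        )
        neighbors.append(order[:k])
    return neighbors


# Toy data (hypothetical): four 3D points forming two spatially separated pairs.
xyz = [(0.0, 0.0, 0.0), (0.1, 0.0, 0.0), (5.0, 0.0, 0.0), (5.1, 0.0, 0.0)]

# Hypothetical latent features: points 0 and 2 look alike, as do points 1 and 3.
latent = [(1.0, 0.0), (0.0, 1.0), (0.9, 0.1), (0.1, 0.9)]

euclid_graph = knn_graph(xyz, k=1)     # neighbors that are close in coordinate space
latent_graph = knn_graph(latent, k=1)  # neighbors that are close in feature space

# Union of both neighbor sets per point: one simple way to mix local and non-local links.
combined = [sorted(set(a) | set(b)) for a, b in zip(euclid_graph, latent_graph)]
```

In this toy case the Euclidean graph links each point only to its spatial partner, while the feature-space graph links it to a point five units away, so the combined neighborhood carries both a local and a non-local connection.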

References

Yang, R. (2024). CaseGPT: A case reasoning framework based on language models and retrieval-augmented generation. arXiv preprint arXiv:2407.07913.

Gao, H., Li, Y., Long, K., Yang, M., & Shen, Y. (2024). A survey for foundation models in autonomous driving. arXiv preprint arXiv:2402.01105.

Tao, Y. (2023). SQBA: Sequential query-based blackbox attack. In: Fifth International Conference on Artificial Intelligence and Computer Science (AICS 2023), 12803, pp. 721-729.

Wantlin, Kathryn, et al. (2023). Benchmd: A benchmark for modality-agnostic learning on medical images and sensors. arXiv preprint arXiv:2304.08486.

Tao, Y. (2023). Meta learning enabled adversarial defense. In: IEEE International Conference on Sensors, Electronics and Computer Engineering (ICSECE), pp. 1326-1330.

Tan, K., Li, P., & Beckers, T. (2024). Physics-constrained learning for PDE systems with uncertainty quantified port-hamiltonian models. arXiv preprint arXiv:2406.11809.

Chen, Y., Shi, H., Liu, X., Shi, T., Zhang, R., Liu, D., ... & Wu, F. (2024). TokenUnify: Scalable autoregressive visual pre-training with mixture token prediction. arXiv preprint arXiv:2405.16847.

Zhan, Donglin, et al. (2019). Adaptive transfer learning of multi-view time series classification. arXiv preprint arXiv:1910.07632.

Tan, K., Wang, J., & Kantaros, Y. (2023, June). Targeted adversarial attacks against neural network trajectory predictors. In: Learning for Dynamics and Control Conference, pp. 431-444. PMLR.

Liu, Y., Yang, H., & Wu, C. (2023). Unveiling patterns: A study on semi-supervised classification of strip surface defects. IEEE Access, 11, 119933-119946.

Jiang, L., Yu, C., Wu, Z., & Wang, Y. (2024). Advanced AI framework for enhanced detection and assessment of abdominal trauma: Integrating 3D segmentation with 2D CNN and RNN models. arXiv preprint arXiv:2407.16165.

Yan, H., Wang, Z., Xu, Z., Wang, Z., Wu, Z., & Lyu, R. (2024). Research on image super-resolution reconstruction mechanism based on convolutional neural network. arXiv preprint arXiv:2407.13211.

Jiang, L., Yu, C., Wu, Z., & Wang, Y. (2024). Advanced AI framework for enhanced detection and assessment of abdominal trauma: Integrating 3D segmentation with 2D CNN and RNN models. arXiv preprint arXiv:2407.16165.

Gao, Z., Wang, Q., Mei, T., Cheng, X., Zi, Y., & Yang, H. (2024). An enhanced encoder-decoder network architecture for reducing information loss in image semantic segmentation. arXiv preprint arXiv:2406.01605.

Jiang, L., Yu, C., Wu, Z., & Wang, Y. (2024). Advanced AI framework for enhanced detection and assessment of abdominal trauma: Integrating 3D segmentation with 2D CNN and RNN models. arXiv preprint arXiv:2407.16165.

Yao, J., Lai, Y., Kou, H., Wu, T., & Liu, R. (2024). QE-BEV: Query evolution for bird's eye view object detection in varied contexts. In: ACM Multimedia.

Song, X., Wu, D., Zhang, B., Peng, Z., Dang, B., Pan, F., & Wu, Z. (2023). Zeroprompt: streaming acoustic encoders are zero-shot masked lms. arXiv preprint arXiv:2305.10649.

Guan, B., Cao, J., Huang, B., Wang, Z., Wang, X., & Wang, Z. (2024). Integrated method of deep learning and large language model in speech recognition.

Liu, C., Ouyang, C., Chen, Y., Quilodrán-Casas, C. C., Ma, L., Fu, J., ... & Arcucci, R. (2023). T3d: Towards 3d medical image understanding through vision-language pre-training. arXiv preprint arXiv:2312.01529.

Wu, J., Hobbs, J., & Hovakimyan, N. (2023). Hallucination improves the performance of unsupervised visual representation learning. In: Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 16132-16143.

Dang, B., Zhao, W., Li, Y., Ma, D., Yu, Q., & Zhu, E. Y. (2024). Real-time pill identification for the visually impaired using deep learning. arXiv [Cs.CV]. Retrieved from: http://arxiv.org/abs/2405.05983.

Yao, J., Li, C., Sun, K., Cai, Y., Li, H., Ouyang, W., & Li, H. (2023, October). Ndc-scene: Boost monocular 3d semantic scene completion in normalized device coordinates space. In: IEEE/CVF International Conference on Computer Vision (ICCV), pp. 9421-9431. IEEE Computer Society.

Dang, B., Ma, D., Li, S., Qi, Z., & Zhu, E. (2024, July). Deep learning-based snore sound analysis for the detection of night-time breathing disorders. Applied and Computational Engineering, 76, 109–114. doi:10.54254/2755-2721/76/20240574.

Pan, Xiaochao, et al. (2023). HarmonicNeRF: Geometry-informed synthetic view augmentation for 3D scene reconstruction in driving scenarios. ACM Multimedia.

Li, S., Dong, X., Ma, D., Dang, B., Zang, H., & Gong, Y. (2024). Utilizing the LightGBM algorithm for operator user credit assessment research. Applied and Computational Engineering, 75(1), 36–47. doi:10.54254/2755-2721/75/20240503.

Yang, R. (2024). CaseGPT: A case reasoning framework based on language models and retrieval-augmented generation. arXiv preprint arXiv:2407.07913.

Wang, J., Hong, S., Dong, Y., Li, Z., & Hu, J. (2024). Predicting stock market trends using LSTM networks: Overcoming RNN limitations for improved financial forecasting. Journal of Computer Science and Software Applications, 4(3), 1-7.

Sun, M., Feng, Z., Li, Z., Gu, W., & Gu, X. (2024). Enhancing financial risk management through LSTM and extreme value theory: A high-frequency trading volume approach. Journal of Computer Technology and Software, 3(3).

Xu, Q., Feng, Z., Gong, C., Wu, X., Zhao, H., Ye, Z., Li, Z., & Wei, C. (2024). Applications of explainable AI in natural language processing. Global Academic Frontiers, 2(3), 51-64.

Xiao, Minheng, Shi Bo, & Zhizhong Wu. (2024). Multiple greedy quasi-newton methods for saddle point problems. arXiv preprint arXiv:2408.00241.

Qi, Zongqing, et al. (2024). Improved YOLOv5 based on attention mechanism and FasterNet for foreign object detection on railway and airway tracks. arXiv preprint arXiv:2403.08499.

Xiang, Ao, et al. (2024). A neural matrix decomposition recommender system model based on the multimodal large language model. arXiv preprint arXiv:2407.08942.

Mo, Yuhong, et al. (2024). Large Language Model (LLM) AI text generation detection based on transformer deep learning algorithm. International Journal of Engineering and Management Research 14(2), 154-159.

Ma, Danqing, et al. (2024). Transformer-based classification outcome prediction for multimodal stroke treatment. arXiv preprint arXiv:2404.12634.

Xiang, Ao, et al. (2024). A multimodal fusion network for student emotion recognition based on transformer and tensor product. arXiv preprint arXiv:2403.08511.

Ao Xiang, Jingyu Zhang, Qin Yang, Liyang Wang, & Yu Cheng. (2024). Research on splicing image detection algorithms based on natural image statistical characteristics. Journal of Image Processing Theory and Applications, 7, 43-52. http://dx.doi.org/10.23977/jipta.2024.070106.

Li, Zhenglin, et al. (2023). Stock market analysis and prediction using LSTM: A case study on technology stocks. Innovations in Applied Engineering and Technology, 1-6.

Li, Shaojie, Yuhong Mo, & Zhenglin Li. (2022). Automated pneumonia detection in chest x-ray images using deep learning model. Innovations in Applied Engineering and Technology, 1-6.

Mo, Yuhong, et al. (2024). Password complexity prediction based on roberta algorithm. Applied Science and Engineering Journal for Advanced Research, 3(3), 1-5.

Song, Jintong, et al. (2024). A comprehensive evaluation and comparison of enhanced learning methods. Academic Journal of Science and Technology, 10(3), 167-171.

Liu, Tianrui, et al. (2024). Spam detection and classification based on distilbert deep learning algorithm. Applied Science and Engineering Journal for Advanced Research, 3(3), 6-10.

Dai, Shuying, et al. (2024). The cloud-based design of unmanned constant temperature food delivery trolley in the context of artificial intelligence. Journal of Computer Technology and Applied Mathematics, 1(1), 6-12.

Mo, Yuhong, et al. (2024). Make scale invariant feature transform “Fly” with CUDA. International Journal of Engineering and Management Research, 14(3), 38-45.

He, Shuyao, et al. (2024). Lidar and monocular sensor fusion depth estimation. Applied Science and Engineering Journal for Advanced Research, 3(3), 20-26.

Liu, Jihang, et al. (2024). Unraveling large language models: From evolution to ethical implications-introduction to large language models. World Scientific Research Journal, 10(5), 97-102.

Mo, Y., Zhang, Y., Li, H., Wang, H., & Yan, X. (2024). Prediction of heart failure patients based on multiple machine learning algorithms. Applied and Computational Engineering, 75, 1-7. doi:10.54254/2755-2721/75/20240498.

Zhu, Armando, et al. (2024). Cross-task multi-branch vision transformer for facial expression and mask wearing classification. Journal of Computer Technology and Applied Mathematics, 1(1), 46-53.

Li, Keqin, et al. (2024). Utilizing deep learning to optimize software development processes. Journal of Computer Technology and Applied Mathematics, 1(1), 70-76.

Li, Keqin, et al. (2024). The application of augmented reality (AR) in remote work and education. Journal of Computer Technology and Applied Mathematics, 1(1), 33-39.

Hong, Bo, et al. (2024). The application of artificial intelligence technology in assembly techniques within the industrial sector. Journal of Artificial Intelligence General Science (JAIGS), 5(1), 1-12.

Dai, Shuying, et al. (2024). AI-based NLP section discusses the application and effect of bag-of-words models and TF-IDF in NLP tasks. Journal of Artificial Intelligence General Science (JAIGS), 5(1), 13-21.

Zhao, Peng, et al. (2024). Task allocation planning based on hierarchical task network for national economic mobilization. Journal of Artificial Intelligence General Science (JAIGS), 5(1), 22-31.

Hu, X., Sun, Z., Nian, Y., Wang, Y., Dang, Y., Li, F., ... & Tao, C. (2024). Self-explainable graph neural network for Alzheimer disease and related dementias risk prediction: Algorithm development and validation study. JMIR Aging, 7(1), e54748.

Li, F., Rasmy, L., Xiang, Y., Feng, J., Abdelhameed, A., Hu, X., ... & Tao, C. (2024). Dynamic prognosis prediction for patients on DAPT after drug‐eluting stent implantation: Model development and validation. Journal of the American Heart Association, 13(3), e029900.

He, J., Li, F., Hu, X., Li, J., Nian, Y., Wang, J., ... & Tao, C. (2022, June). Chemical-protein relation extraction with pre-trained prompt tuning. In: IEEE 10th International Conference on Healthcare Informatics (ICHI), pp. 608-609. IEEE.

He, J., Li, F., Li, J., Hu, X., Nian, Y., Xiang, Y., ... & Tao, C. (2024). Prompt tuning in biomedical relation extraction. Journal of Healthcare Informatics Research, 8(2), 206-224.

Bo, S., & Xiao, M. (2022). Dynamic risk measurement by EVT based on stochastic volatility models via MCMC. arXiv preprint arXiv:2201.09434.

Xiao, M., Bo, S., & Wu, Z. (2024). Multiple greedy quasi-newton methods for saddle point problems. arXiv preprint arXiv:2408.00241.
