Harnessing NLP and Big Data to Solve Linguistic Challenges in Indonesian Humanoid Robots: Pathways to Innovation and Entrepreneurship

Authors

  • Syaifullah, Universitas Lambung Mangkurat
  • Wenny Noorahim, Universitas Lambung Mangkurat

DOI:

https://doi.org/10.64268/josce.v1i2.35

Keywords:

Digital Entrepreneurship, Innovation Ecosystem, Language Technology Commercialization, NLP-Based Startups, Supply Chain of AI Products

Abstract

Aim: Indonesian, the national language of Indonesia, has intricate linguistic features, including agglutinative morphology, idioms, and numerous dialectal variations. These features pose significant challenges for developing humanoid robots that can interact naturally through Natural Language Processing (NLP). This study aims to address these linguistic complexities while exploring the entrepreneurial potential of localized NLP applications in Indonesia.
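
To illustrate the agglutinative morphology mentioned above, the following is a minimal Python sketch, not the method used in the study. It assumes a tiny hypothetical root lexicon (ROOTS) and a short, incomplete affix list, and it simply strips one prefix and one suffix to see whether a known root remains. The function names candidate_splits and analyze are illustrative only.

```python
# Toy illustration of Indonesian affixation: many surface forms share one root.
# ROOTS is a hypothetical mini-lexicon; a real system would use a full dictionary.
ROOTS = {"makan", "jalan", "ajar"}

# Incomplete affix inventories, listed so that longer variants are tried
# before their shorter counterparts ("meng" before "me", "pel" before "pe").
PREFIXES = ("", "mem", "men", "meng", "me", "ber", "bel", "per", "pel", "pe", "di", "ter")
SUFFIXES = ("", "kan", "an", "i")


def candidate_splits(word):
    """Yield every (prefix, root, suffix) split whose middle part is a known root."""
    for prefix in PREFIXES:
        if not word.startswith(prefix):
            continue
        rest = word[len(prefix):]
        for suffix in SUFFIXES:
            if not rest.endswith(suffix):
                continue
            stem = rest[:len(rest) - len(suffix)]
            if stem in ROOTS:
                yield prefix, stem, suffix


def analyze(word):
    """Return the first (prefix, root, suffix) split found, or None."""
    return next(candidate_splits(word.lower()), None)


if __name__ == "__main__":
    for w in ["makanan", "dimakan", "memakan", "menjalankan", "perjalanan", "pelajaran", "robot"]:
        print(w, "->", analyze(w))
```

This sketch deliberately ignores phenomena that make real Indonesian morphology harder, such as nasal assimilation (for example, "menulis" from the root "tulis"), reduplication, and informal or dialectal forms, which is part of why dictionary-backed stemmers or learned subword models are typically preferred in practice.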

Methods: The research uses a qualitative literature review of existing studies on Indonesian NLP datasets, transformer-based language models, and speech technologies. Key sources include IndoNLI for natural language inference, IndoSentiment for sentiment analysis, and case studies of humanoid robots such as Lumen. The analysis also covers approaches that use Big Data, multi-pass decoders, and contextual language modeling to improve performance in Indonesian linguistic settings.

Findings: The review indicates that the successful development of Indonesian-speaking humanoid robots relies on context-aware NLP models trained on representative, culturally relevant datasets. Integrating multimodal systems and Big Data improves comprehension of idiomatic, regional, and informal expressions. The research also shows that NLP-based innovations can be commercialized through AI-powered assistants, educational bots, and digital customer service, opening new opportunities for tech-driven entrepreneurship.

Significance: This study contributes to both technological advancement and business innovation by linking linguistic AI research with entrepreneurial applications. It underscores the importance of building a robust local data ecosystem and designing language models that reflect Indonesia’s linguistic diversity. These insights matter not only for improving human-robot interaction but also for fostering sustainable digital entrepreneurship in emerging markets such as Indonesia.

References

Budiharto, W. (2020). Deep learning-based question answering system for intelligent humanoid robot. Journal of Big Data, 7(1). https://doi.org/10.1186/s40537-020-00341-6

Cahyawijaya, S., Aji, A. F., Lovenia, H., Winata, G. I., Wilie, B., Mahendra, R., Koto, F., Moeljadi, D., Vincentio, K., Romadhony, A., & Purwarianti, A. (2022). NusaCrowd: A call for open and reproducible NLP research in Indonesian languages. arXiv. https://doi.org/10.48550/arXiv.2205.15960

Ernawati, I. A., Brawijaya, K. S., Aini, F., & Nurhayati, E. (2023). Perkembangan ragam bahasa dalam komunikasi mahasiswa di lingkungan kampus UPN “Veteran” Jawa Timur. Jurnal Pengabdian West Science, 2(6), 406–420. https://doi.org/10.58812/jpws.v2i6.388

Heinrich, S., & Wermter, S. (2011). Towards robust speech recognition for human-robot interaction. In Proceedings of the IROS Workshop on Cognitive Neuroscience Robotics (CNR) (pp. 29–34).

Jiono, M. (2020). Self localization based on neighborhood probability mapping for humanoid robot. In 4th International Conference on Vocational Education and Training (ICOVET 2020) (pp. 355–359). https://doi.org/10.1109/ICOVET50258.2020.9230237

Mahendra, R., Aji, A. F., Louvan, S., Rahman, F., & Vania, C. (2021). IndoNLI: A natural language inference dataset for Indonesian. In Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing (pp. 10511–10527). https://doi.org/10.18653/v1/2021.emnlp-main.821

Moleong, L. J. (2018). Metodologi Penelitian Kualitatif (Ed. Revisi). Remaja Rosdakarya.

Sya, S. S., & Prihatmanto, A. S. (2015). Design and implementation of image processing system for Lumen social robot-humanoid as an exhibition guide for Electrical Engineering Days. Proceedings of Electrical Engineering Days. https://ieeexplore.ieee.org/document/7738307

Sugiyono. (2019). Metode Penelitian Kualitatif, Kuantitatif, Dan R&D. Alfabeta.

Chen, J., Liu, Z., Huang, X., Wu, C., Liu, Q., Jiang, G., Pu, Y., Lei, Y., Chen, X., Wang, X., Zheng, K., Lian, D., & Chen, E. (2024). When large language models meet personalization: Perspectives of challenges and opportunities. World Wide Web, 27(4), 42. https://doi.org/10.1007/s11280-024-01276-1

Cucchiarini, C., Hubers, F., & Strik, H. (2022). Learning L2 idioms in a CALL environment: The role of practice intensity, modality, and idiom properties. Computer Assisted Language Learning, 35(4), 863–891. https://doi.org/10.1080/09588221.2020.1752734

Diao, L., & Hu, P. (2021). Deep learning and multimodal target recognition of complex and ambiguous words in automated English learning system. Journal of Intelligent & Fuzzy Systems, 40(4), 7147–7158. https://doi.org/10.3233/JIFS-189543

Ferasso, M., Tortato, U., & Ikram, M. (2023). Mapping the Circular Economy in the Small and Medium-sized Enterprises field: An exploratory network analysis. Cleaner and Responsible Consumption, 11, 100149. https://doi.org/10.1016/j.clrc.2023.100149

Gruetzemacher, R., & Paradice, D. (2022). Deep Transfer Learning & Beyond: Transformer Language Models in Information Systems Research. ACM Comput. Surv., 54(10s), 204:1-204:35. https://doi.org/10.1145/3505245

Hussain, M. (2023). When, Where, and Which?: Navigating the Intersection of Computer Vision and Generative AI for Strategic Business Integration. IEEE Access, 11, 127202–127215. https://doi.org/10.1109/ACCESS.2023.3332468

Lin, J., Dai, X., Xi, Y., Liu, W., Chen, B., Zhang, H., Liu, Y., Wu, C., Li, X., Zhu, C., Guo, H., Yu, Y., Tang, R., & Zhang, W. (2025). How Can Recommender Systems Benefit from Large Language Models: A Survey. ACM Trans. Inf. Syst., 43(2), 28:1-28:47. https://doi.org/10.1145/3678004

Luckyardi, S., Karin, J., Rosmaladewi, R., Hufad, A., & Haristiani, N. (2024). Chatbots as Digital Language Tutors: Revolutionizing Education Through AI. Indonesian Journal of Science and Technology, 9(3), Article 3. https://doi.org/10.17509/ijost.v9i3.79514

Luckyardi, S., Munawaroh, S., Abduh, A., Rosmaladewi, R., Hufad, A., & Haristiani, N. (2024). Advancing Language Education in Indonesia: Integrating Technology and Innovations. ASEAN Journal of Science and Engineering, 4(3), Article 3. https://doi.org/10.17509/ajse.v4i3.79471

Nasution, A. H., & Onan, A. (2024). ChatGPT Label: Comparing the Quality of Human-Generated and LLM-Generated Annotations in Low-Resource Language NLP Tasks. IEEE Access, 12, 71876–71900. https://doi.org/10.1109/ACCESS.2024.3402809

Paramesha, M., Rane, N., & Rane, J. (2024). Big Data Analytics, Artificial Intelligence, Machine Learning, Internet of Things, and Blockchain for Enhanced Business Intelligence (SSRN Scholarly Paper 4855856). Social Science Research Network. https://doi.org/10.2139/ssrn.4855856

Ragno, L., Borboni, A., Vannetti, F., Amici, C., & Cusano, N. (2023). Application of Social Robots in Healthcare: Review on Characteristics, Requirements, Technical Solutions. Sensors, 23(15), Article 15. https://doi.org/10.3390/s23156820

Schiavo, F., Campitiello, L., Todino, M. D., & Di Tore, P. A. (2024). Educational Robots, Emotion Recognition and ASD: New Horizon in Special Education. Education Sciences, 14(3), Article 3. https://doi.org/10.3390/educsci14030258

Younis, H. A., Ruhaiyem, N. I. R., Ghaban, W., Gazem, N. A., & Nasser, M. (2023). A Systematic Literature Review on the Applications of Robots and Natural Language Processing in Education. Electronics, 12(13), Article 13. https://doi.org/10.3390/electronics12132864

Zhao, S., Wu, Y., Tsang, Y.-K., Sui, X., & Zhu, Z. (2021). Morpho-semantic analysis of ambiguous morphemes in Chinese compound word recognition: An fMRI study. Neuropsychologia, 157, 107862. https://doi.org/10.1016/j.neuropsychologia.2021.107862

Published

2025-07-05

How to Cite

Syaifullah, S., & Noorahim, W. N. (2025). Harnessing NLP and Big Data to Solve Linguistic Challenges in Indonesian Humanoid Robots: Pathways to Innovation and Entrepreneurship. Journal of Supply Chain and Entrepreneurship, 1(2), 45–52. https://doi.org/10.64268/josce.v1i2.35