Designing Efficient Web Scraping Pipelines for Intelligent Systems: Java vs. Python


DİKİLİTAŞ Y., SAYAR A., Kırkaya A. E., Yalçın Ö.

7th International Conference on Intelligent and Fuzzy Systems, INFUS 2025, İstanbul, Türkiye, 29 - 31 July 2025, vol. 1530 LNNS, pp. 557-564, (Full Text Paper)

  • Publication Type: Conference Paper / Full Text Paper
  • Volume: 1530 LNNS
  • DOI: 10.1007/978-3-031-98565-2_60
  • City of Publication: İstanbul
  • Country of Publication: Türkiye
  • Pages: pp. 557-564
  • Keywords: HTML, Java, Python, Web Scraping
  • Affiliated with Kocaeli University: Yes

Abstract

This paper presents a concise comparison between Java and Python for web scraping. Java and Python are two popular programming languages used in extracting data from websites. The comparison examines their syntax differences, available libraries (such as Jsoup for Java and BeautifulSoup/Scrapy for Python), ecosystem support, learning curve, and performance. Real-world use cases are explored to highlight situations in which each language shines. The findings provide valuable information for developers and researchers in making an informed decision about which language to choose for web scraping projects, considering factors such as ease of use, community support, and performance requirements.
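
As a rough illustration of the kind of HTML extraction task the abstract refers to, the sketch below pulls link targets and link text out of a small page. It is not taken from the paper, whose code is not part of this record. The class name `LinkExtractor`, the function `extract_links`, and the sample HTML are all hypothetical, and the sketch uses Python's standard-library `html.parser` rather than BeautifulSoup or Scrapy so that it runs without third-party packages.

```python
from html.parser import HTMLParser


class LinkExtractor(HTMLParser):
    """Collects (href, text) pairs for every <a> element that has an href."""

    def __init__(self):
        super().__init__()
        self.links = []      # finished (href, text) pairs
        self._href = None    # href of the <a> currently open, if any
        self._text = []      # text fragments seen inside that <a>

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            # attrs is a list of (name, value) tuples
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        # Only keep text while we are inside an <a> that has an href
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text).strip()))
            self._href = None


def extract_links(html):
    """Parse an HTML string and return a list of (href, text) tuples."""
    parser = LinkExtractor()
    parser.feed(html)
    parser.close()
    return parser.links


# Hypothetical sample page, used only for illustration
SAMPLE_HTML = """
<html><body>
  <h1>Papers</h1>
  <a href="/p/1">First paper</a>
  <a href="/p/2">Second <b>paper</b></a>
  <a>No link</a>
</body></html>
"""

if __name__ == "__main__":
    for href, text in extract_links(SAMPLE_HTML):
        print(href, text)
```

In practice a dedicated library such as BeautifulSoup (Python) or Jsoup (Java) would typically replace the hand-written parser class with a few selector calls. That difference in verbosity is one aspect of the ease-of-use comparison the abstract describes.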