The Apache Software Foundation maintains a significant presence on GitHub, with a wide range of public repositories. Its projects are written primarily in Java, Rust, Python, Scala, Go, and C++. Notable repositories include Apache Superset, Apache ECharts, and Apache Airflow, which cover data visualization and workflow orchestration.
Apache Superset is a Data Visualization and Data Exploration Platform
Apache ECharts is a powerful, interactive charting and data visualization library for the browser
Apache Airflow - A platform to programmatically author, schedule, and monitor workflows
Apache Spark - A unified analytics engine for large-scale data processing
The Java implementation of Apache Dubbo. An RPC and microservice framework.
Apache Kafka - A distributed event streaming platform
Apache Flink
Empowering Data Intelligence with Distributed SQL for Sharding, Scalability, and Security Across All Databases.
Apache Casbin: an authorization library that supports access control models like ACL, RBAC, ABAC.
PouchDB is a pocket-sized database.
brpc is an industrial-grade RPC framework written in C++, often used in high-performance systems such as search, storage, machine learning, advertisement, and recommendation. "brpc" means "better RPC".
Apache Arrow is the universal columnar format and multi-language toolbox for fast data interchange and in-memory analytics
The Cloud-Native API Gateway and AI Gateway
Apache Hadoop
Q&A platform software for teams at any scale. Whether it's a community forum, help center, or knowledge management platform, you can always count on Apache Answer.
Apache Doris is an easy-to-use, high performance and unified analytics database.
Apache Pulsar - distributed pub-sub messaging system
Apache DolphinScheduler is a modern data orchestration platform for agile, low-code creation of high-performance workflows.
Open Machine Learning Compiler Framework
Apache Thrift
Open source transactional distributed database. Linear scalability and proven fault-tolerance on commodity hardware or cloud infrastructure without compromising performance.
Apache JMeter - An open-source load testing tool for analyzing and measuring the performance of a variety of services
SeaTunnel is a multimodal, high-performance, distributed, massive data integration tool.
Apache Iceberg
Apache DataFusion SQL Query Engine
Apache ShenYu is a Java native API Gateway for service proxy, protocol conversion and API governance.
Apache Camel is an open source integration framework with 350+ connectors. Write routes in Java, YAML, or XML. Run on Spring Boot, Quarkus, or standalone. Apache License 2.0.
Upserts, Deletes And Incremental Processing on Big Data.
Apache NiFi
Apache Pinot - A realtime distributed OLAP datastore
Apache HBase
Apache Groovy: A powerful multi-faceted programming language for the JVM platform
Apache Calcite
Apache Maven core
Go implementation of Apache Dubbo.
A blazingly fast multi-language serialization framework for idiomatic domain objects, schema IDL, and cross-language data exchange.
Apache Iggy: Hyper-Efficient Message Streaming at Laser Speed
Make stream processing easier! Easy-to-use streaming application development framework and operation platform.
Apache NuttX is a mature, real-time embedded operating system (RTOS)
The Apache Tika toolkit detects and extracts metadata and text from over a thousand different file types (such as PPT, XLS, and PDF).
Extensible SQL Lexer and Parser for Rust
Apache Avro is a data serialization system.
Apache Nutch is an extensible and scalable web crawler
A graph database that supports 100+ billion data entries with high performance and scalability (includes OLTP engine, REST API, and backends)
Mirror of Apache PDFBox
Apache NetBeans
Apache Parquet Java
Apache Commons Lang
Grails - the Web Application Framework
GoCQL Driver for Apache Cassandra®
Apache Parquet Format
Apache Lucene.NET is an open-source full-text search library written in C#, ported from the Apache Lucene project.
Build applications that make decisions (chatbots, agents, simulations, etc.). Monitor, trace, persist, and execute on your own infrastructure.
Samples for Apache Dubbo
Apache Kyuubi is a distributed and multi-tenant gateway to provide serverless SQL on data warehouses and lakehouses.
A cluster computing framework for processing large-scale geospatial data
Mirror of Apache POI gitbox. The Java API for Microsoft Documents.
Apache TinkerPop - a graph computing framework
Apache Polaris, the interoperable, open source catalog for Apache Iceberg
Mirror of Apache Kudu
EventMesh is a new generation serverless event middleware for building distributed event-driven applications.
The Apache Guacamole web application, official extensions, and JavaScript library.
Hop Orchestration Platform
Apache Struts is a free, open-source, MVC framework for creating elegant, modern Java web applications
An advanced and mature open-source MPP (Massively Parallel Processing) database. Open source alternative to Greenplum Database.
Apache Commons IO
Apache OFBiz is an open source product for the automation of enterprise processes. It includes framework components and business applications for ERP, CRM, E-Business/E-Commerce, Supply Chain Management and Manufacturing Resource Planning. OFBiz provides a foundation and starting point for reliable, secure and scalable enterprise solutions.
A scalable, mature and versatile web crawler based on Apache Storm
Apache Camel K is a lightweight integration platform, born on Kubernetes, with serverless superpowers
Apache StreamPipes - A self-service (Industrial) IoT toolbox to enable non-technical users to connect, analyze and explore IoT data streams.
Apache Commons CSV
Flink Agents is an Agentic AI framework based on Apache Flink
Read-only mirror of Apache SpamAssassin.
Website sources for the Apache Community Development Website
Apache Commons Validator
The Streaming-first HTTP server/module of Apache Pekko
Apache Maven Dependency Plugin
Apache Flink Website
Showcase Application to demonstrate features of Apache SkyWalking
Apache Phoenix Adapters
Apache Camel Website
Mirror of Apache XMLBeans
Apache Pekko gRPC
Apache Fory Website
ASF GitHub Actions Repository
Asynchronously writes journal and snapshot entries to configured JDBC databases so that Apache Pekko Actors can recover state
Apache Kvrocks Website
Apache Parquet Site
Apache Maven Distribution Tools
Apache Superset Kubernetes Operator
Apache Paimon Vector Index: pure Rust IVF-PQ for data lake vector search.
Apache Superset (Incubating) website
Apache Druid
Apache BifroMQ (Incubating) Website
Apache Maven Jarsigner
Apache SkyWalking next-generation UI (Horizon)
Apache TomEE published website
Apache Camel
Apache Grails Website & Documentation
Apache Flink
Apache develops various data processing and visualization tools on GitHub. Projects like Apache Superset and Apache Airflow highlight their focus on data analytics and workflow management, while other repositories support distributed event streaming and microservices.
The primary programming languages used by Apache include Java, Rust, Python, Scala, Go, and C++. These languages allow for the development of a diverse range of applications, from data visualization to distributed systems.
All of Apache's repositories on GitHub are public. This openness allows for community contributions and transparency in development, fostering collaboration on numerous projects across various domains.