
lisongbo RaqForum 82 No.

Using esProc SPL for multi-source calculations is convenient and much simpler than a logical data warehouse

Mixed computations across multiple data sources are much simpler to handle with esProc SPL than with a logical data warehouse. Just add a few jars to the application and write a few lines of script, without the hassle of building a logical data warehouse.
For example, to join order data in MySQL with product data in MongoDB, write a script called orderAmount.splx:

	A
1	=connect("mysql")
2	=A1.query@x("SELECT o.order_id, o.user_id, o.order_date, oi.product_id, oi.quantity, oi.price FROM orders o JOIN order_items oi ON o.order_id = oi.order_id WHERE o.order_date >= CURDATE()- INTERVAL 1 MONTH")
3	=mongo_open("mongodb://127.0.0.1:27017/raqdb")
4	=mongo_shell@d(A3, "{'find':'products', 'filter': { 'category': {'$in': ['Tablets', 'Wearables', 'Audio'] } }}")
5	=A2.join@i(product_id,A4:product_id,name,brand,category,attributes)
6	=A5.groups(category;sum(price*quantity):amount)
7	return A6

A1 to A4 retrieve the data through the native interfaces of the two data sources (SQL for MySQL, a shell command for MongoDB); A5 and A6 perform the join and the aggregation. The application can obtain the mixed computation result by calling the script via JDBC.
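As a rough idea of what the JDBC call can look like on the application side, here is a minimal Java sketch. The driver class name, the URL, and the `call orderAmount()` syntax are what I understand the embedded esProc JDBC driver to use, so treat them as assumptions and check them against your esProc version; `OrderAmountClient` and `callStatement` are just names made up for this example. It also assumes the esProc jars are on the classpath and that orderAmount.splx can be found on esProc's script search path.

```java
import java.sql.CallableStatement;
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;

class OrderAmountClient {
    // Assumption: class name and URL of esProc's embedded JDBC driver.
    static final String DRIVER = "com.esproc.jdbc.InternalDriver";
    static final String URL = "jdbc:esproc:local://";

    // Builds the call text for a script; the name is the file name without ".splx".
    static String callStatement(String scriptName) {
        return "call " + scriptName + "()";
    }

    public static void main(String[] args) throws Exception {
        Class.forName(DRIVER); // fails here if the esProc jars are missing
        try (Connection con = DriverManager.getConnection(URL);
             CallableStatement st = con.prepareCall(callStatement("orderAmount"))) {
            st.execute();
            try (ResultSet rs = st.getResultSet()) {
                // A6 returns two columns: category and amount.
                while (rs.next()) {
                    System.out.println(rs.getObject(1) + "\t" + rs.getObject(2));
                }
            }
        }
    }
}
```

From the application's point of view the script behaves much like a stored procedure, so the rest of the code does not need to know that two different databases are involved.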
esProc supports many types of data sources, including JDBC data sources, NoSQL databases, Kafka, ES, GCS, Hadoop, JSON, CSV, and Excel, and more types can be added.
esProc's computing power is its own: filtering, grouping, joins, and so on are performed by esProc itself rather than by the data sources. Once the data has been read, it can be computed together regardless of where it came from, so mixed computing comes naturally.
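To make the "read first, then compute together" idea concrete, here is a small plain-Java sketch of the kind of work A5 and A6 do in the script above: an inner join on product_id followed by a sum of price*quantity per category. This is not how esProc is implemented, only an illustration of the logic; the class names and the sample rows are made up, and in the real script the rows would come from MySQL and MongoDB.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class MixedComputeSketch {
    // One order line, as it might be read from MySQL (hypothetical shape).
    static final class OrderItem {
        final int productId; final int quantity; final double price;
        OrderItem(int productId, int quantity, double price) {
            this.productId = productId; this.quantity = quantity; this.price = price;
        }
    }

    // One product, as it might be read from MongoDB (hypothetical shape).
    static final class Product {
        final int productId; final String category;
        Product(int productId, String category) {
            this.productId = productId; this.category = category;
        }
    }

    // Inner join on productId, then sum(price * quantity) per category.
    static Map<String, Double> amountByCategory(List<OrderItem> items, List<Product> products) {
        Map<Integer, String> categoryById = new HashMap<>();
        for (Product p : products) {
            categoryById.put(p.productId, p.category);
        }
        Map<String, Double> amount = new TreeMap<>(); // keeps categories sorted
        for (OrderItem it : items) {
            String category = categoryById.get(it.productId);
            if (category == null) {
                continue; // no matching product: dropped, as in an inner join
            }
            amount.merge(category, it.price * it.quantity, Double::sum);
        }
        return amount;
    }

    public static void main(String[] args) {
        List<OrderItem> items = Arrays.asList(
            new OrderItem(1, 2, 100.0),
            new OrderItem(2, 1, 50.0),
            new OrderItem(1, 1, 100.0),
            new OrderItem(9, 5, 10.0)); // product 9 is not in the product list
        List<Product> products = Arrays.asList(
            new Product(1, "Tablets"),
            new Product(2, "Audio"));
        System.out.println(amountByCategory(items, products));
        // prints {Audio=50.0, Tablets=300.0}
    }
}
```

Nothing in `amountByCategory` knows which database the two lists came from, which is the point: the calculation is independent of the data sources.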
When the amount of data involved is relatively large, esProc also supports cursor-style reading and has a parallel mechanism, which covers almost all computing scenarios.

esProc is almost ready to use after installation. If the goal is only to implement multi-source mixed computing, esProc is enough. Building and operating a logical data warehouse is too heavy: it suits the data platform of a large institution, but is not worth it for a single application.

Unlike a logical data warehouse, esProc reads data in SPL scripts through native interfaces, so the script has to be modified when the data source changes; it cannot be completely transparent to the underlying data source the way a logical data warehouse is. Fortunately, only the data retrieval logic needs to change (to use the new data source's interface), while the calculation logic stays the same, which is acceptable.

SPL Official Website 👉 https://www.esproc.com

SPL Feedback and Help 👉 https://www.reddit.com/r/esProcSPL

SPL Learning Material 👉 https://c.esproc.com

SPL Source Code and Package 👉 https://github.com/SPLWare/esProc

Discord 👉 https://discord.gg/sxd59A8F2W

YouTube 👉 https://www.youtube.com/@esProc_SPL
