1. Data retrieval and analysis

 

Text file meteorolog.txt stores meteorological data. Using the daily average temperature TEMP, daily highest temperature MAX, and daily lowest temperature MIN: (1) compute the monthly average, highest, and lowest temperatures at BAODING station; (2) get the records where the lowest temperature is below the whole year's lowest temperature in BEIJING; and (3) get the top five records with the highest daily average temperatures.


SPL code:

A
1 =file("meteorolog.txt").import@t()
2 =A1.select(STATION:"BAODING").groups@n(month(DATE):Month;round(avg(TEMP),2):AVGTEMP, max(MAX):MAX, min(MIN):MIN)
3 =A1.select(STATION:"BEIJING").min(MIN)
4 =A1.select(MIN<A3)
5 =A1.top(5, -TEMP, ~)

This exercise covers the basic uses of SPL. A1 reads the text file as a table sequence; the @t option tells it to read the first line as column headers. The f.import() function can be replaced by the T(fn) function here, so A1 can be rewritten as =T("meteorolog.txt"). T() identifies the file format from the extension and reads the file as a table sequence.

A2 selects the records of BAODING station and groups them by month with the groups() function, computing the average, highest, and lowest temperature for each group. Because month values and group numbers correspond one to one, it uses groups@n to treat the grouping value as the group number, which improves performance.

A3 computes the yearly lowest temperature at BEIJING station.

A4 further gets records from all stations where the lowest temperature is lower than the yearly lowest temperature at BEIJING station.
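For readers who want to cross-check the logic of A2 to A4 outside SPL, the following is a minimal Python sketch of the same steps, not the SPL implementation. The sample rows and the helper names `monthly_stats` and `colder_than_station_min` are hypothetical; only the field names STATION, DATE, TEMP, MAX, and MIN come from the exercise.

```python
from collections import defaultdict
from datetime import date

# Hypothetical sample rows; the real meteorolog.txt is not reproduced here.
records = [
    {"STATION": "BAODING", "DATE": date(2020, 1, 1), "TEMP": -2.0, "MAX": 3.0, "MIN": -7.0},
    {"STATION": "BAODING", "DATE": date(2020, 1, 2), "TEMP": -1.0, "MAX": 4.0, "MIN": -6.0},
    {"STATION": "BAODING", "DATE": date(2020, 7, 1), "TEMP": 28.0, "MAX": 34.0, "MIN": 23.0},
    {"STATION": "BEIJING", "DATE": date(2020, 1, 1), "TEMP": -3.0, "MAX": 2.0, "MIN": -9.0},
    {"STATION": "BEIJING", "DATE": date(2020, 7, 1), "TEMP": 27.0, "MAX": 33.0, "MIN": 22.0},
    {"STATION": "HARBIN", "DATE": date(2020, 1, 1), "TEMP": -20.0, "MAX": -15.0, "MIN": -26.0},
]


def monthly_stats(rows, station):
    """Group one station's rows by month (the role of A2).

    Returns (month, average TEMP rounded to 2 places, highest MAX, lowest MIN),
    ordered by month. Months without data are simply absent.
    """
    by_month = defaultdict(list)
    for r in rows:
        if r["STATION"] == station:
            by_month[r["DATE"].month].append(r)
    return [
        (
            month,
            round(sum(r["TEMP"] for r in group) / len(group), 2),
            max(r["MAX"] for r in group),
            min(r["MIN"] for r in group),
        )
        for month, group in sorted(by_month.items())
    ]


def colder_than_station_min(rows, station):
    """Rows from any station whose MIN is below `station`'s overall lowest MIN (A3 and A4)."""
    threshold = min(r["MIN"] for r in rows if r["STATION"] == station)
    return [r for r in rows if r["MIN"] < threshold]


baoding_monthly = monthly_stats(records, "BAODING")
colder_rows = colder_than_station_min(records, "BEIJING")
```

Note that the filter in the second helper runs over all rows, not just those of the reference station, which is the same choice A4 makes by selecting from A1 rather than from the BEIJING subset.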

A5 gets the five records with the highest daily average temperatures. TEMP is preceded by a negative sign so that the records are ranked by temperature in descending order. By specifying ~ as the last parameter, the top() function returns the whole records that hold the five highest daily average temperatures rather than just the temperature values.
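The two ideas in A5, negating the key to rank in descending order and returning whole records instead of bare values, can be illustrated with a short Python sketch. This is an analogy using the standard library's `heapq.nsmallest`, not a description of how SPL's top() works internally; the sample rows and the function name `top_n_by_temp` are hypothetical.

```python
import heapq

# Hypothetical sample rows with distinct TEMP values.
rows = [
    {"STATION": "BAODING", "DATE": "2020-07-01", "TEMP": 28.0},
    {"STATION": "BEIJING", "DATE": "2020-07-02", "TEMP": 31.0},
    {"STATION": "BAODING", "DATE": "2020-07-03", "TEMP": 25.0},
    {"STATION": "BEIJING", "DATE": "2020-07-04", "TEMP": 30.0},
    {"STATION": "BAODING", "DATE": "2020-07-05", "TEMP": 29.0},
    {"STATION": "BEIJING", "DATE": "2020-07-06", "TEMP": 33.0},
    {"STATION": "BAODING", "DATE": "2020-07-07", "TEMP": 27.0},
]


def top_n_by_temp(records, n):
    """Return the n whole records with the highest TEMP.

    nsmallest ranks in ascending order, so negating TEMP makes the
    hottest days come first, mirroring the -TEMP trick in A5.
    """
    return heapq.nsmallest(n, records, key=lambda r: -r["TEMP"])


hottest_records = top_n_by_temp(rows, 5)                 # whole records
hottest_values = [r["TEMP"] for r in hottest_records]    # bare values only
```

Here `hottest_records` keeps the station and date alongside each temperature, while `hottest_values` shows what is lost when only the ranked values are returned.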


2. Test data preparation