"The database table tbl has a string field DESCRIPTION. ARTIKELNR DESCRIPTION 104009400031900 S23 .."

jinxing RaqForum 41 No.
18 View • 4 Months ago

From SQL to SPL: Substring from a column of strings

The database table tbl has a string field DESCRIPTION.

ARTIKELNR	DESCRIPTION
104009400031900	S235JRG2C+C EN 10277/10 80 x 8 mm Blanker Flachstahl
104009800002950	Werksattest nach EN 10204
105009400092360	EN 10060 Inox Rund Ø 28 mm WNr1.4034 warmgefertigt
105009400068571	90 mm WNr1.4404 warmgefertigt, EN 10060, geschmiedet
105009400004420	WNr1.4301 Inox Vierkantstahl 6 x 6 mm EN 10278, blankgezogen
105009400008800	WNr1.4301 Inox Band geschnitten 25 x 4 mm, Breitentol. 25 m
105009400068600	WNr1.4112 Inox Rund 8.00 mm nach EN 10088-3 + A+C+SL, gesch
105009400068700	WNr1.4301 Inox Rund 36.00 mm nach EN 10088-3, geschliffen, T
105009400068800	WNr1.4104 Inox Rund 25.00 mm nach EN 10088-3, geschliffen, T
105009400010035	WNr1.4404 Inox Vierkantstahl 30 x 30 mm, gezogen EN 10278

Now we need to retrieve the word 'EN' and the subsequent string of numbers from the DESCRIPTION field. The string of numbers may consist entirely of digits, such as '10204', or it may contain special characters, such as '10277/10'. Caution: Do not retrieve punctuation marks; If the string does not contain 'EN', return null.

ARTIKELNR	NORM
104009400031900	EN 10277/10
104009800002950	EN 10204
105009400092360	EN 10060
105009400068571	EN 10060
105009400004420	EN 10278
105009400008800
105009400068600	EN 10088-3
105009400068700	EN 10088-3
105009400068800	EN 10088-3
105009400010035	EN 10278

SQL：

DECLARE @separator CHAR(1) = SPACE(1);
SELECT *
    , REPLACE(c.query('
          for $x in /root/r[text()="EN"]
          let $pos := count(root/r[. << $x]) + 1
          return if (xs:int(substring((/root/r[$pos + 1]/text())[1],1,5)) instance of xs:int) then
                data(/root/r[position()=($pos, $pos + 1)])
            else data(/root/r[$pos])
        ').value('text()[1]', 'NVARCHAR(30)')
        ,',', '') AS NORM
FROM @tbl AS t
CROSS APPLY (SELECT PATINDEX('%EN[0-9][0-9][0-9][0-9][0-9]%', DESCRIPTION)) AS t2(pos)  -- to handle Exception cases
CROSS APPLY (SELECT TRY_CAST('<root><r><![CDATA[' + 
    REPLACE(IIF(pos > 0, STUFF(DESCRIPTION,pos + 2,0, SPACE(1)), DESCRIPTION), @separator, ']]></r><r><![CDATA[') + 
    --REPLACE(DESCRIPTION, @separator, ']]></r><r><![CDATA[') + 
    ']]></r></root>' AS XML)) AS t1(c)
ORDER BY ID;

SQL can implement it using CROSS APPLY with XML syntax, but the code is complex and lengthy. Using regular expressions can also implement it, but the code is more difficult to understand. SPL provides string splitting functions and ordered calculation functions, with simple and easy to understand code:https://try.esproc.com/splx?4RS

	A
1	$select * from tbl.txt
2	=A1.new(ARTIKELNR, if(s=substr(DESCRIPTION,"EN ").split().select@c(pos("012346789/+-%_",~)).concat(),"EN "+s):NORM)

A1： Load data.

A2： Find the substring after "EN " from the large string, split it into a set by character, extract the preceding numeric members, as well as special characters such as +-%_ in the numeric string, and merge them into a numeric string. The select function is used for filtering, @c represents taking from the first member onward until the first member that makes the condition false. The pos function returns the position of the substring.

Question source：https://stackoverflow.com/questions/78159534/fetch-values-from-previous-non-null-value-rows

SPL Official Website 👉 https://www.esproc.com

SPL Feedback and Help 👉 https://www.reddit.com/r/esProcSPL

SPL Learning Material 👉 https://c.esproc.com

SPL Source Code and Package 👉 https://github.com/SPLWare/esProc

Discord 👉 https://discord.gg/sxd59A8F2W

Youtube 👉 https://www.youtube.com/@esProc_SPL

Application