The FILTERXML function returns specific data from XML content by using the specified xpath.
- The FILTERXML function is not available in Excel for the web and Excel for Mac.
- This function may appear in the function gallery in Excel for Mac, but it relies on features of the Windows operating system, so it will not return results on Mac.
Syntax
FILTERXML(xml, xpath)
The FILTERXML function syntax has the following arguments.
Argument | Description |
---|---|
xml | Required. A string in valid XML format |
xpath | Required. A string in standard XPath format |
- If xml is not valid, FILTERXML returns the
#VALUE!
error value. - If xml contains a namespace with a prefix that is not valid, FILTERXML returns the
#VALUE!
error value.
Examples
This example uses the FILTERXML function on XML data returned in cell B2
, which is provided by the results of the WEBSERVICE function on the web query in cell B1
.
The data from FILTERXML is returned in cells B3:B12
and C3:C12
, and shows the most recently updated query results on Wikipedia and the time of their updates (in UTC, "Coordinated Universal Time").
Cell B1
contain the query url string:
https://en.wikipedia.org/w/api.php?action=query&list=recentchanges&rcnamespace=0&format=xml
Cell B2
contain the formula:
=WEBSERVICE(B1)
Cell B2
returns:
<?xml version="1.0"?>
<api batchcomplete="">
<continue rccontinue="20221015085902|1559952690" continue="-||"/>
<query>
<recentchanges>
<rc type="edit" ns="0" title="2022 NASCAR Cup Series" pageid="64612305" revid="1116192151" old_revid="1115951681" rcid="1559952705" timestamp="2022-10-15T08:59:07Z"/>
<rc type="edit" ns="0" title="Bothrocara" pageid="5278803" revid="1116192149" old_revid="1017742971" rcid="1559952703" timestamp="2022-10-15T08:59:06Z"/>
<rc type="edit" ns="0" title="Cisnormativity" pageid="71996420" revid="1116192148" old_revid="1116189075" rcid="1559952702" timestamp="2022-10-15T08:59:05Z"/>
<rc type="edit" ns="0" title="Steppe Arena" pageid="60625799" revid="1116192147" old_revid="1116191896" rcid="1559952701" timestamp="2022-10-15T08:59:05Z"/>
<rc type="edit" ns="0" title="Marius Bülter" pageid="58074752" revid="1116192146" old_revid="1112795402" rcid="1559952700" timestamp="2022-10-15T08:59:05Z"/>
<rc type="edit" ns="0" title="Edward Eager" pageid="1633389" revid="1116192145" old_revid="1114631450" rcid="1559952699" timestamp="2022-10-15T08:59:05Z"/>
<rc type="edit" ns="0" title="Galp Energia" pageid="5969536" revid="1116192141" old_revid="1109334278" rcid="1559952696" timestamp="2022-10-15T08:59:03Z"/>
<rc type="edit" ns="0" title="Kaltungo Emirate" pageid="72011272" revid="1116192142" old_revid="1116192102" rcid="1559952694" timestamp="2022-10-15T08:59:03Z"/>
<rc type="edit" ns="0" title="List of Justice League episodes" pageid="2910641" revid="1116192139" old_revid="1116190255" rcid="1559952692" timestamp="2022-10-15T08:59:03Z"/>
<rc type="edit" ns="0" title="Joy dol" pageid="71949265" revid="1116192143" old_revid="1116192014" rcid="1559952697" timestamp="2022-10-15T08:59:02Z"/>
</recentchanges>
</query>
</api>
Cells B3:B12
contain the formula:
=FILTERXML(B2,"//rc/@title")
Cells B3:B12
returns:
2022 NASCAR Cup Series
Bothrocara
Cisnormativity
Steppe Arena
Marius Bülter
Edward Eager
Galp Energia
Kaltungo Emirate
List of Justice League episodes
Joy dol
Cells C3:C12
contain the formula:
=FILTERXML(B2,"//rc/@timestamp")
Cells C3:C12
formated with a custom formats: yyyy-mm-ddThh:mm:ss.000Z
, it returns:
2022-10-15T08:59:07.000Z
2022-10-15T08:59:06.000Z
2022-10-15T08:59:05.000Z
2022-10-15T08:59:05.000Z
2022-10-15T08:59:05.000Z
2022-10-15T08:59:05.000Z
2022-10-15T08:59:03.000Z
2022-10-15T08:59:03.000Z
2022-10-15T08:59:03.000Z
2022-10-15T08:59:02.000Z
If you have a current version of Microsoft 365, then you can simply enter the formula in the top-left-cell of the output range, then press ENTER to confirm the formula as a dynamic array formula. Otherwise, the formula must be entered as a legacy array formula by first selecting the output range, entering the formula in the top-left-cell of the output range, and then pressing CTRL+SHIFT+ENTER to confirm it. Excel inserts curly brackets at the beginning and end of the formula for you. For more information on array formulas, see Guidelines and examples of array formulas.
You can also use FILTERXML in conjunction with the ENCODEURL and WEBSERVICE functions. For example, you can use the markitondemand.com API service to return information about a company stock:
=WEBSERVICE("http://dev.markitondemand.com/MODApis/Api/Quote/xml?symbol="&ENCODEURL(C2))
Where you have a stock ticker in cell C2
. You can then use the FILTERXML function to extract specific data about that stock. For example, to get the Last Price of the stock ticker in cell C2
, you can use:
=FILTERXML(WEBSERVICE("http://dev.markitondemand.com/MODApis/Api/Quote/xml?symbol="&ENCODEURL(C2)),"//QuoteApiModel/Data/LastPrice")
Convert string to array
In cell A1 there is a string separated by ",
":
A,1,b,2,Dx
Now, how can I convert A1 to an array using a formula?
{"A";1;"b";2;"Dx"}
IF you use MID function:
=TRIM(MID(SUBSTITUTE(A1,",",REPT(" ",LEN(A1))),(ROW(INDIRECT("1:"&(LEN(A1)-LEN(SUBSTITUTE(A1,",",""))+1)))-1)*LEN(A1)+1,LEN(A1)))
But, if you use the FILTERXML function, it's very simple:
=FILTERXML("<r><c>"&SUBSTITUTE(A1,",","</c><c>")&"</c></r>","//c")
In this formula, you can replace "r
" or "c
" with your favorite string.