Wednesday, 25 February 2015

XPath: How to select all sibling nodes up to one fulfilling some condition?



I am trying to write an XPath expression that returns all sibling nodes up to the next one that satisfies a specific condition. In my case I have an (X)HTML list in which some list items have a specific class and the others have no class.


To visualize: I am standing at one of the list items that DO have the class "foo" (e.g. the li containing the text "D"), and I want to get a list of the subsequent li's containing "E", "F" and "G", but none of the subsequent items containing "H", "I" and "J".



...
<li class="foo">A</li>
<li>B</li>
<li>C</li>
<li class="foo">D</li>
<li>E</li>
<li>F</li>
<li>G</li>
<li class="foo">H</li>
<li>I</li>
<li>J</li>
...



I am using Java v1.8 and its built-in javax.xml.xpath package accessing a previously parsed org.w3c.dom.Document.


Note: I have googled extensively for a solution and I am aware that there are quite a number of very similar-looking examples, even here on StackOverflow, but none of them worked for me! Whatever I tried and adapted to the case at hand gave me either only the first element ("E" in this example) or none at all. :-(


Later addition:


Since I apparently expressed myself so badly, I am appending a test program:



package pull_lis;

import java.io.FileInputStream;

import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpression;
import javax.xml.xpath.XPathFactory;

import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.w3c.tidy.Tidy;

public class TestXPathExpression
{
    public static void main(String[] args) throws Exception {
        Tidy tidy = new Tidy();
        XPathFactory xpathfactory = XPathFactory.newInstance();
        XPath xpath = xpathfactory.newXPath();

        Document doc = tidy.parseDOM(new FileInputStream("sample.xml"), System.out);

        XPathExpression expr1 = xpath.compile("//li[@class='foo']");

        // XPathExpression expr2 = xpath.compile("//li[@class='foo'][2]/following-sibling::li[@class='foo'][1]/preceding-sibling::li[preceding-sibling::li[@class='foo'][2]]");
        XPathExpression expr2 = xpath.compile("/???"); // <<<< IT IS THIS EXPRESSION THAT I AM SEEKING

        NodeList foos = (NodeList)expr1.evaluate(doc, XPathConstants.NODESET);
        System.out.println(foos.getLength() + " foos found.");

        for (int idx1 = 0; idx1 < foos.getLength(); idx1++) {
            Node foo = foos.item(idx1);
            System.out.println("foo[" + idx1 + "]: " + foo.getChildNodes().item(0).getNodeValue());
            NodeList nodes = (NodeList)expr2.evaluate(foo, XPathConstants.NODESET);
            for (int idx2 = 0; idx2 < nodes.getLength(); idx2++) {
                Node node = nodes.item(idx2);
                System.out.println(node.getChildNodes().item(0).getNodeValue());
            }
        }
    }
}


sample.xml contains:



<html>
<head>
<title>Example</title>
</head>
<body>
<ul>
<li class="foo">A</li>
<li>B</li>
<li>C</li>
<li class="foo">D</li>
<li>E</li>
<li>F</li>
<li>G</li>
<li class="foo">H</li>
<li>I</li>
<li>J</li>
</ul>
</body>
</html>


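If I run the above program on sample.xml using the commented-out expression (which starts with // and is therefore evaluated from the document root, not from the current foo), I get: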



3 foos found.
foo[0]: A
E
F
G
foo[1]: D
E
F
G
foo[2]: H
E
F
G


but what I want/need is:



3 foos found.
foo[0]: A
B
C
foo[1]: D
E
F
G
foo[2]: H
I
J


I hope I have made myself clear this time... M.
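
One possible way to get the output wanted above, offered as a sketch and not as the definitive answer: XPath 1.0 as exposed by javax.xml.xpath has no current() function (that one belongs to XSLT), so the sketch below works in two steps per foo. It first counts the foo siblings preceding the context node, then builds a relative expression that keeps only the following class-less li's having exactly that many foo's (plus the context node itself) before them. The class name SiblingsUntilNextFoo, the method groups and the inline SAMPLE string are made up for this illustration; the sketch parses with the JDK's DocumentBuilder instead of Tidy and assumes the li elements are in no namespace.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;

import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class SiblingsUntilNextFoo {

    // Inline copy of the list from sample.xml, so the sketch needs neither a file nor Tidy.
    static final String SAMPLE =
        "<ul>"
        + "<li class=\"foo\">A</li><li>B</li><li>C</li>"
        + "<li class=\"foo\">D</li><li>E</li><li>F</li><li>G</li>"
        + "<li class=\"foo\">H</li><li>I</li><li>J</li>"
        + "</ul>";

    // Maps the text of each foo item to the texts of the li's between it and the next foo.
    static Map<String, List<String>> groups(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
            XPath xpath = XPathFactory.newInstance().newXPath();

            NodeList foos = (NodeList) xpath.evaluate(
                "//li[@class='foo']", doc, XPathConstants.NODESET);
            Map<String, List<String>> result = new LinkedHashMap<>();

            for (int i = 0; i < foos.getLength(); i++) {
                Node foo = foos.item(i);

                // Step 1: number of foo siblings before this one; +1 counts this foo itself.
                Double before = (Double) xpath.evaluate(
                    "count(preceding-sibling::li[@class='foo'])", foo, XPathConstants.NUMBER);
                int n = before.intValue() + 1;

                // Step 2: relative expression (no leading slash), evaluated from this foo.
                // A following li belongs to this group if exactly n foo's precede it.
                String expr = "following-sibling::li[not(@class='foo')]"
                    + "[count(preceding-sibling::li[@class='foo']) = " + n + "]";
                NodeList nodes = (NodeList) xpath.evaluate(expr, foo, XPathConstants.NODESET);

                List<String> texts = new ArrayList<>();
                for (int j = 0; j < nodes.getLength(); j++) {
                    texts.add(nodes.item(j).getTextContent());
                }
                result.put(foo.getTextContent(), texts);
            }
            return result;
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        groups(SAMPLE).forEach((foo, items) -> System.out.println(foo + " -> " + items));
    }
}
```

On the sample list this groups B and C under A, E to G under D, and I and J under H. The important detail is that the second expression is relative: an expression beginning with / or // is resolved from the document root regardless of which node is passed to evaluate, which would explain getting the same three items for every foo.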



