I ran into a few problems parsing some XML with the functions below, so I wrote a parser from scratch. I'm pretty happy with the results it gives, especially the way the arrays are structured, so somebody else might find it useful.
http://www.phpclasses.org/browse/package/4510.html
xml_parse_into_struct
(PHP 4, PHP 5)
xml_parse_into_struct - Parse XML data into an array structure
Description
int xml_parse_into_struct ( resource $parser , string $data , array &$values [, array &$index ] )
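This function parses XML data into two parallel array structures: values holds the parsed data, and index holds pointers to the positions of the corresponding entries in values. Both array parameters must be passed by reference.
Note: xml_parse_into_struct() returns 0 for failure and 1 for success. This is not the same as FALSE and TRUE, so be careful with operators such as ===.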
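The note about the integer return value can be illustrated with a minimal sketch. The helper name parseStruct() is an assumption made for this illustration and is not part of the XML extension; the second input is deliberately malformed.

```php
<?php
// Hypothetical helper (not part of the XML extension): returns the values
// array on success, or null when the parser reports failure.
function parseStruct(string $xml): ?array
{
    $parser = xml_parser_create();
    $values = [];
    $index  = [];
    $ok = xml_parse_into_struct($parser, $xml, $values, $index);
    // $ok is the integer 0 or 1, so a test such as "$ok === false"
    // would never match; compare against the integer instead.
    return ($ok === 1) ? $values : null;
}

$good = parseStruct("<para><note>simple note</note></para>");
$bad  = parseStruct("<para><note>unclosed</para>"); // mismatched tags

var_dump(is_array($good)); // bool(true)
var_dump($bad);            // NULL
```

A loose check such as if (!$ok) also works, because 0 is falsy; only the strict comparisons against TRUE or FALSE are a trap.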
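The following example illustrates the internal structure of the arrays generated by the function. We simply embed a note tag inside a para tag, parse it, and print out the generated structures:
Example#1 xml_parse_into_struct() example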
<?php
$simple = "<para><note>simple note</note></para>";
$p = xml_parser_create();
xml_parse_into_struct($p, $simple, $vals, $index);
xml_parser_free($p);
echo "Index array\n";
print_r($index);
echo "\nVals array\n";
print_r($vals);
?>
When we run that code, the output will be:
Index array
Array
(
    [PARA] => Array
        (
            [0] => 0
            [1] => 2
        )

    [NOTE] => Array
        (
            [0] => 1
        )

)

Vals array
Array
(
    [0] => Array
        (
            [tag] => PARA
            [type] => open
            [level] => 1
        )

    [1] => Array
        (
            [tag] => NOTE
            [type] => complete
            [level] => 2
            [value] => simple note
        )

    [2] => Array
        (
            [tag] => PARA
            [type] => close
            [level] => 1
        )

)
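As a small sketch of how the two arrays relate, the positions stored in the index array can be used to look entries up in the values array. It reuses the input of Example#1; the variable names $notes, $open and $close are chosen only for this illustration.

```php
<?php
$parser = xml_parser_create();
xml_parse_into_struct($parser, "<para><note>simple note</note></para>", $vals, $index);

// $index['NOTE'] lists every position in $vals that belongs to a NOTE tag.
$notes = [];
foreach ($index['NOTE'] as $pos) {
    $notes[] = $vals[$pos]['value'];
}
print_r($notes);

// For PARA, the index holds the positions of its open and close entries.
list($open, $close) = $index['PARA'];
echo $vals[$open]['type'], ' ', $vals[$close]['type'], "\n";
```

Tag names are upper-cased here because case folding is enabled by default.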
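Event-driven parsing (based on the expat library) can get complicated when your XML document is complex. This function does not produce a DOM-style object; it generates flat structures that can be traversed in a tree fashion. We can therefore easily create objects that represent the data in the XML file. Suppose the following XML file represents a small database of amino acid information:
Example#2 moldb.xml - a small database of molecular information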
<?xml version="1.0"?>
<moldb>
<molecule>
<name>Alanine</name>
<symbol>ala</symbol>
<code>A</code>
<type>hydrophobic</type>
</molecule>
<molecule>
<name>Lysine</name>
<symbol>lys</symbol>
<code>K</code>
<type>charged</type>
</molecule>
</moldb>
Example#3 parsemoldb.php - parses moldb.xml into an array of molecular objects
<?php
class AminoAcid {
var $name; // aa name
var $symbol; // three-letter symbol
var $code; // one-letter code
var $type; // hydrophobic, charged or neutral
function AminoAcid ($aa)
{
foreach ($aa as $k=>$v)
$this->$k = $aa[$k];
}
}
function readDatabase($filename)
{
// read the XML database of amino acids
$data = implode("",file($filename));
$parser = xml_parser_create();
xml_parser_set_option($parser, XML_OPTION_CASE_FOLDING, 0);
xml_parser_set_option($parser, XML_OPTION_SKIP_WHITE, 1);
xml_parse_into_struct($parser, $data, $values, $tags);
xml_parser_free($parser);
// loop through the XML structures
foreach ($tags as $key=>$val) {
if ($key == "molecule") {
$molranges = $val;
// each contiguous pair of array entries are the
// lower and upper range for each molecule definition
for ($i=0; $i < count($molranges); $i+=2) {
$offset = $molranges[$i] + 1;
$len = $molranges[$i + 1] - $offset;
$tdb[] = parseMol(array_slice($values, $offset, $len));
}
} else {
continue;
}
}
return $tdb;
}
function parseMol($mvalues)
{
for ($i=0; $i < count($mvalues); $i++) {
$mol[$mvalues[$i]["tag"]] = $mvalues[$i]["value"];
}
return new AminoAcid($mol);
}
$db = readDatabase("moldb.xml");
echo "** Database of AminoAcid objects:\n";
print_r($db);
?>
** Database of AminoAcid objects:
Array
(
    [0] => aminoacid Object
        (
            [name] => Alanine
            [symbol] => ala
            [code] => A
            [type] => hydrophobic
        )

    [1] => aminoacid Object
        (
            [name] => Lysine
            [symbol] => lys
            [code] => K
            [type] => charged
        )

)
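readDatabase() above turns off XML_OPTION_CASE_FOLDING so that it can compare against the lower-case key "molecule". The following sketch shows the effect of that option on tag names; tagNames() is a hypothetical helper written only for this illustration.

```php
<?php
// Hypothetical helper for illustration: returns the tag names as reported
// by xml_parse_into_struct() with case folding switched on or off.
function tagNames(string $xml, bool $caseFolding): array
{
    $parser = xml_parser_create();
    xml_parser_set_option($parser, XML_OPTION_CASE_FOLDING, $caseFolding ? 1 : 0);
    xml_parse_into_struct($parser, $xml, $values);
    return array_column($values, 'tag');
}

$xml = "<molecule><name>Alanine</name></molecule>";
print_r(tagNames($xml, true));  // MOLECULE, NAME, MOLECULE
print_r(tagNames($xml, false)); // molecule, name, molecule
```

With the default setting, the comparison $key == "molecule" in readDatabase() would never match, because the key would be "MOLECULE".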
xml_parse_into_struct
Wicked Father
22-Apr-2008 09:28
jukea
19-Apr-2008 05:09
Concerning Adam Tylmad's code, note that the line
if ($data = xml::cleanString($data))
prevents 0 values from being considered, because "0" evaluates to false. I just tracked down this bug in our system... ouch.
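A minimal sketch of the pitfall and one possible fix. cleanString() below is only a stand-in (an assumption), not Adam Tylmad's actual method; the point is the difference between a truthiness test and an explicit comparison with the empty string.

```php
<?php
// Stand-in for xml::cleanString() (assumption): just trims whitespace.
function cleanString(string $data): string
{
    return trim($data);
}

// Truthiness test, as in the line quoted above: the string "0" is dropped.
function keepLoose(string $data): bool
{
    return (bool) cleanString($data);
}

// Explicit comparison: only the empty string is dropped.
function keepStrict(string $data): bool
{
    return cleanString($data) !== '';
}

var_dump(keepLoose("0"));   // bool(false)
var_dump(keepStrict("0"));  // bool(true)
var_dump(keepStrict("  ")); // bool(false)
```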
wickedfather at hotmail dot com
12-Apr-2008 11:36
To beaudurrant: that class is great and structures things in a very sensible way. The only problem is that it raises an error if a tag is empty, so I would suggest a simple modification to the parse method that adds an isset test:
if (isset($val['value']))
{
$this->setArrayValue($this->output, $stack, $val['value']);
}
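The reason for the isset test can be seen in a small, self-contained sketch: an empty element is reported as a "complete" entry without a 'value' key. The sample document is made up for this illustration.

```php
<?php
$parser = xml_parser_create();
xml_parser_set_option($parser, XML_OPTION_CASE_FOLDING, 0);
xml_parse_into_struct($parser, "<root><full>text</full><empty/></root>", $vals);

foreach ($vals as $val) {
    if ($val['type'] !== 'complete') {
        continue;
    }
    // Fall back to an empty string when the element carried no text.
    $value = isset($val['value']) ? $val['value'] : '';
    echo $val['tag'], ' => "', $value, "\"\n";
}
```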
mathiasrav at gmail dot com
11-Mar-2008 12:42
In response to Anonymous' post at 26-Feb-2008 11:50:
Saying that you "don't understand everything" isn't going to get you very popular - you should understand the code you use.
foreach isn't *slow* in PHP, it is actually faster than the equivalent for-construct (which, in many cases, isn't available).
The reason your script is slow is simply your use of xml_parse_into_struct: it reads the whole XML string and doesn't return until it has parsed all of it. If you're looking for efficiency, you'll have to use the more low-level xml_parser_create and xml_set_*_handler functions. Then make sure you don't keep everything in a huge array before outputting it (at least not if you're going for efficiency).
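A rough sketch of that handler-based approach, assuming the document consists of item elements: the XML is fed to xml_parse() in chunks and only the data of interest is kept. The function name countItems() and the sample chunks are made up for this illustration.

```php
<?php
// Counts <item> elements and collects their text while the document is fed
// to the parser chunk by chunk; no values array of the whole file is built.
function countItems(array $chunks): array
{
    $count  = 0;
    $inItem = false;
    $texts  = [];

    $parser = xml_parser_create();
    xml_parser_set_option($parser, XML_OPTION_CASE_FOLDING, 0);
    xml_set_element_handler(
        $parser,
        function ($p, $name, $attrs) use (&$count, &$inItem, &$texts) {
            if ($name === 'item') {
                $count++;
                $inItem  = true;
                $texts[] = '';
            }
        },
        function ($p, $name) use (&$inItem) {
            if ($name === 'item') {
                $inItem = false;
            }
        }
    );
    // Character data may arrive in several pieces, so it is appended.
    xml_set_character_data_handler(
        $parser,
        function ($p, $data) use (&$inItem, &$texts) {
            if ($inItem) {
                $texts[count($texts) - 1] .= $data;
            }
        }
    );

    $last = count($chunks) - 1;
    foreach ($chunks as $i => $chunk) {
        // The third argument tells the parser whether this is the final chunk.
        if (!xml_parse($parser, $chunk, $i === $last)) {
            throw new RuntimeException(xml_error_string(xml_get_error_code($parser)));
        }
    }
    return [$count, $texts];
}

list($n, $texts) = countItems(["<list><item>a</item><it", "em>b</item><item>c</item></list>"]);
echo $n, "\n";
print_r($texts);
```

In a real script the chunks would typically come from fread() calls on a file handle rather than from an array.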
Anonymous
26-Feb-2008 06:50
Hi, I actually use this parser without understanding everything. I read somewhere that "foreach" is very slow, and I did notice that this parser is slow when it gets a lot of data. How should I edit it to make it faster (with exactly the same output)? Thanks in advance.
$xml_parser = xml_parser_create();
$data = $outputone;
xml_parse_into_struct($xml_parser, $data, $vals, $index);
xml_parser_free($xml_parser);
$params = array();
$level = array();
$i="1";
foreach ($vals as $xml_elem) {
if ($xml_elem['type'] == 'open' && $xml_elem['level'] == '1') {
$level[$xml_elem['level']] = $xml_elem['tag'];
}
if ($xml_elem['type'] == 'open' && $xml_elem['level'] == '2') {
$level[$xml_elem['level']] = $xml_elem['tag']."".$i;
$i++;
}
if ($xml_elem['type'] == 'complete') {
$start_level = 1;
$php_stmt = '$params';
while($start_level < $xml_elem['level']) {
$php_stmt .= '[$level['.$start_level.']]';
$start_level++;
}
$php_stmt .= '[$xml_elem[\'tag\']] = $xml_elem[\'value\'];';
eval($php_stmt);
}
}
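As a side note on the eval() call in the loop above: the same nested assignment can be sketched with a reference that walks down the array. setByPath() is a hypothetical helper name, and the sample path values are made up.

```php
<?php
// Hypothetical helper: assigns $value at the nested position described by
// $path, creating intermediate arrays as needed (no eval() involved).
function setByPath(array &$target, array $path, $value): void
{
    $ref = &$target;
    foreach ($path as $key) {
        if (!isset($ref[$key]) || !is_array($ref[$key])) {
            $ref[$key] = [];
        }
        $ref = &$ref[$key];
    }
    $ref = $value;
}

$params = [];
$level  = [1 => 'test', 2 => 'item1'];
// Equivalent of building '$params[$level[1]][$level[2]][$tag] = $value;'
setByPath($params, [$level[1], $level[2], 'name'], 'Item1');
print_r($params);
```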
cesaralcaide at gmail dot com
10-Jan-2008 06:35
I couldn't find an appropriate xml2array translation for my purpose, so I wrote this:
(it converts an XML string to an associative array, allowing multiple elements with the same name)
/////////////////////////////////// Start XML
//
//
//
// Converts an XML document into an associative array whose elements are arrays
// (to allow several elements with the same name)
//
// Limitation: an element named "attributos" cannot appear in the XML, because that key
// holds the attributes of a tag (a tag starts with "T", not "t")
//
////////////////////////////////////
function xml_analiza($xml) {
global $xml_resul,$xml_n,$xml_cont,$xml_attr;
$xml_n = 0;
$xml_resul = array();
$xml_cont = array();
$xml_attr = array();
$p = xml_parser_create();
// To keep tag names case-sensitive:
xml_parser_set_option($p, XML_OPTION_CASE_FOLDING, 0);
xml_set_element_handler($p, "xml_inicio", "xml_fin");
xml_set_character_data_handler($p, "xml_char");
// Trim everything up to the first tag after the XML declaration:
$i = strpos($xml, "<" . "?xml");
if ($i === false) return array();
$j = strpos($xml, "?" . ">", $i);
if ($j === false) return array();
$xml = substr($xml,$j+2);
if (!xml_parse($p, $xml))
trigger_error("xml_analiza: XML error: " . xml_error_string(xml_get_error_code($p))
. " on line " . xml_get_current_line_number($p) . " ($xml)", E_USER_WARNING);
xml_parser_free($p);
if (!sizeof($xml_resul)) return array();
return $xml_resul[0];
}
function xml_inicio($p, $nombre, $atributos) {
global $xml_resul,$xml_n,$xml_cont,$xml_attr;
$xml_n++;
$xml_resul[] = array();
$xml_cont[] = "";
$xml_attr[] = $atributos;
}
function xml_fin($p, $nombre) {
global $xml_resul,$xml_n,$xml_cont,$xml_attr;
$xml_n--;
$nuevo = array_pop($xml_resul);
$nombre = $nombre;
if ($nombre == "attributos") trigger_error("xml_analiza: tag name not allowed (attributos)", E_USER_WARNING);
$conte = array_pop($xml_cont);
$attrib = array_pop($xml_attr);
if ($conte) $xml_resul[$xml_n][$nombre][] = $conte;
else {
if ($nuevo) {
if ($attrib) $nuevo["attributos"][] = $attrib;
$xml_resul[$xml_n][$nombre][] = $nuevo;
}
else
$xml_resul[$xml_n][$nombre][] = "";
}
}
function xml_char($p, $data) {
global $xml_cont;
$xml_cont[sizeof($xml_cont)-1] .= trim(str_replace("\n","",$data));
}
function xml_a($a) {
// Returns an element of the XML array, e.g.: xml_a($v,"FichaCircuito","Red",0)
$n = func_num_args();
for ($i=1;$i<$n;$i++) {
$b = func_get_arg($i);
if (isset($a[$b]))
$a = $a[$b];
else {
if (!isset($a[0][$b][0])) return "";
$a = $a[0][$b];
}
}
return $a;
}
beaudurrant at gmail dot com
20-Dec-2007 03:23
This is extending what Alf Marius Foss Olsen had posted above.
It takes into account array keys with the same name and uses an increment for them instead of overwriting the keys.
I am using it for SOAP requests (20K - 150K) and it parses very fast compared to PEAR.
<?
class XMLParser {
// raw xml
private $rawXML;
// xml parser
private $parser = null;
// array returned by the xml parser
private $valueArray = array();
private $keyArray = array();
// arrays for dealing with duplicate keys
private $duplicateKeys = array();
// return data
private $output = array();
private $status;
public function XMLParser($xml){
$this->rawXML = $xml;
$this->parser = xml_parser_create();
return $this->parse();
}
private function parse(){
$parser = $this->parser;
xml_parser_set_option($parser, XML_OPTION_CASE_FOLDING, 0); // Dont mess with my cAsE sEtTings
xml_parser_set_option($parser, XML_OPTION_SKIP_WHITE, 1); // Dont bother with empty info
if(!xml_parse_into_struct($parser, $this->rawXML, $this->valueArray, $this->keyArray)){
$this->status = 'error: '.xml_error_string(xml_get_error_code($parser)).' at line '.xml_get_current_line_number($parser);
return false;
}
xml_parser_free($parser);
$this->findDuplicateKeys();
// tmp array used for stacking
$stack = array();
$increment = 0;
foreach($this->valueArray as $val) {
if($val['type'] == "open") {
//if array key is duplicate then send in increment
if(array_key_exists($val['tag'], $this->duplicateKeys)){
array_push($stack, $this->duplicateKeys[$val['tag']]);
$this->duplicateKeys[$val['tag']]++;
}
else{
// else send in tag
array_push($stack, $val['tag']);
}
} elseif($val['type'] == "close") {
array_pop($stack);
// reset the increment if the tag does not exist in the stack
if(array_key_exists($val['tag'], $stack)){
$this->duplicateKeys[$val['tag']] = 0;
}
} elseif($val['type'] == "complete") {
//if array key is duplicate then send in increment
if(array_key_exists($val['tag'], $this->duplicateKeys)){
array_push($stack, $this->duplicateKeys[$val['tag']]);
$this->duplicateKeys[$val['tag']]++;
}
else{
// else send in tag
array_push($stack, $val['tag']);
}
$this->setArrayValue($this->output, $stack, $val['value']);
array_pop($stack);
}
$increment++;
}
$this->status = 'success: xml was parsed';
return true;
}
private function findDuplicateKeys(){
for($i=0;$i < count($this->valueArray); $i++) {
// duplicate keys are when two complete tags are side by side
if($this->valueArray[$i]['type'] == "complete"){
if( $i+1 < count($this->valueArray) ){
if($this->valueArray[$i+1]['tag'] == $this->valueArray[$i]['tag'] && $this->valueArray[$i+1]['type'] == "complete"){
$this->duplicateKeys[$this->valueArray[$i]['tag']] = 0;
}
}
}
// also when a close tag is before an open tag and the tags are the same
if($this->valueArray[$i]['type'] == "close"){
if( $i+1 < count($this->valueArray) ){
if( $this->valueArray[$i+1]['type'] == "open" && $this->valueArray[$i+1]['tag'] == $this->valueArray[$i]['tag'])
$this->duplicateKeys[$this->valueArray[$i]['tag']] = 0;
}
}
}
}
private function setArrayValue(&$array, $stack, $value){
if ($stack) {
$key = array_shift($stack);
$this->setArrayValue($array[$key], $stack, $value);
return $array;
} else {
$array = $value;
}
}
public function getOutput(){
return $this->output;
}
public function getStatus(){
return $this->status;
}
}
?>
Usage:
$p = new XMLParser($xml);
$p->getOutput();
php dot net at crazedsanity dot com
24-Oct-2007 10:32
There's an updated version of cs-phpxml (http://sf.net/projects/cs-phpxml, or https://cs-phpxml.svn.sourceforge.net/svnroot/cs-phpxml/releases for the latest out of subversion) which easily converts an XML string into a PHP array. Using my previous example:
<?php
/**
*
* *********** EXAMPLE ***********
*
* Original file contents:
* <test xmlns="stuff">
* <indexOne>hello</indexOne>
* <my_single_index testAttribute="hello" />
* <multiple_items>
* <item>1</item>
* <item>2</item>
* </multiple_items>
* </test>
*
* Would return:::
*
* array(
* TEST => array(
* indexOne => hello,
* my_single_index => NULL,
* multiple_items => array(
* items => array(
* 0 => 1,
* 1 => 2
* )
* ),
* ),
* );
*/
?>
I've been using this in many production environments, and it's been very stable. The syntax is pretty simple, too:
<?php
require_once(dirname(__FILE__) ."/cs-phpxml/xmlParserClass.php");
$xmlParser = new xmlParser(file_get_contents("test.xml"));
$myArray = $xmlParser->get_tree(TRUE);
?>
Alf Marius Foss Olsen
12-Sep-2007 09:46
<?php
/*
An easy, lightweight (Array ->) XML -> Array algorithm.
Typical case: you have an array you want to export to an external server,
so you make XML out of it, export it, and "on the other side"
turn it into an array again. These two functions take care
of that last part, i.e. XML -> Array.
NOTE! The function XMLToArray assumes that the XML does _not_ have nodes on the
same level with the same name; otherwise it just won't work. This is not a
problem here, as this case deals with Array -> XML -> Array, and an array
can't have two identical indexes/keys.
*/
function XMLToArray($xml) {
$parser = xml_parser_create('ISO-8859-1'); // For Latin-1 charset
xml_parser_set_option($parser, XML_OPTION_CASE_FOLDING, 0); // Dont mess with my cAsE sEtTings
xml_parser_set_option($parser, XML_OPTION_SKIP_WHITE, 1); // Dont bother with empty info
xml_parse_into_struct($parser, $xml, $values);
xml_parser_free($parser);
$return = array(); // The returned array
$stack = array(); // tmp array used for stacking
foreach($values as $val) {
if($val['type'] == "open") {
array_push($stack, $val['tag']);
} elseif($val['type'] == "close") {
array_pop($stack);
} elseif($val['type'] == "complete") {
array_push($stack, $val['tag']);
setArrayValue($return, $stack, $val['value']);
array_pop($stack);
}//if-elseif
}//foreach
return $return;
}//function XMLToArray
function setArrayValue(&$array, $stack, $value) {
if ($stack) {
$key = array_shift($stack);
setArrayValue($array[$key], $stack, $value);
return $array;
} else {
$array = $value;
}//if-else
}//function setArrayValue
// USAGE:
$xml = <<<QQQ
<?xml version="1.0"?>
<root>
<node1>Some text</node1>
<node2a>
<node2b>
<node2c>Some text</node2c>
</node2b>
</node2a>
</root>\n
QQQ;
$array = XMLToArray($xml);
print "<pre>";
print_r($array);
print "</pre>";
// Output:
//
// Array
// (
// [root] => Array
// (
// [node1] => Some text
// [node2a] => Array
// (
// [node2b] => Array
// (
// [node2c] => Some text
// )
// )
// )
// )
?>
vinod at citadel-soft dot com
01-Sep-2007 06:43
My previous code had some bugs; they are fixed now.
<?php
class CSLXmlReader {
private $tagstack;
private $xmlvals;
private $xmlvarArrPos;
private $xmlfile;
function __construct($filename) // constructor to initialize the stack and val array
{
$this->tagstack = array(); // contain the open tags till now
$this->xmlvals = array();
$this->xmlvarArrPos = $this->xmlvals; // temporary variable to hold the current tag position
$this->xmlfile = $filename;
}
function readDatabase()
{
// read the XML database
$data = implode("", file($this->xmlfile));
$parser = xml_parser_create();
xml_parser_set_option($parser, XML_OPTION_CASE_FOLDING, 0);
xml_parser_set_option($parser, XML_OPTION_SKIP_WHITE, 1);
xml_parse_into_struct($parser, $data, $values, $tags);
xml_parser_free($parser);
foreach($values as $key => $val) //
{
if($val['type'] == "open")
{
array_push($this->tagstack, $val['tag']);
$this->getArrayPath();
if(count($this->xmlvarArrPos) > 0 && (!array_key_exists(0,$this->xmlvarArrPos)))
{
$temp1 = $this->xmlvarArrPos;
$this->xmlvarArrPos = array();
$this->xmlvarArrPos[0] = $temp1;
array_push($this->tagstack, 1);
}
else if( array_key_exists(0,$this->xmlvarArrPos)){
$opncount = count($this->xmlvarArrPos);
array_push($this->tagstack, $opncount);
}
$tagStackPointer += 1;
}else if($val['type'] == "close")
{
while( $val['tag'] != ($lastOpened = array_pop($this->tagstack))){}
}else if($val['type'] == "complete")
{
$this->getArrayPath();
if( array_key_exists($val['tag'],$this->xmlvarArrPos))
{
if(array_key_exists(0,$this->xmlvarArrPos[$val['tag']]))
{
$elementCount = count($this->xmlvarArrPos[$val['tag']]);
$this->xmlvarArrPos[$val['tag']][$elementCount] = $val['value'];
}else
{
$temp1 = $this->xmlvarArrPos[$val['tag']];
$this->xmlvarArrPos[$val['tag']] = array();
$this->xmlvarArrPos[$val['tag']][0] = $temp1;
$this->xmlvarArrPos[$val['tag']][1] = $val['value'];
}
} else
{
$this->xmlvarArrPos[$val['tag']] = $val['value'];
}
}
}
reset($this->xmlvals);
return $this->xmlvals;
}
function getArrayPath()
{
reset($this->xmlvals);
$this->xmlvarArrPos = &$this->xmlvals;
foreach($this->tagstack as $key)
{
$this->xmlvarArrPos = &$this->xmlvarArrPos[$key];
}
}
}
$readerObj = new CSLXmlReader("test.xml");
$xmlvals = $readerObj->readDatabase();
echo "########## XML values as a multidimensional array #############\n";
echo "<pre>";
print_r($xmlvals);
echo "</pre>";
?>
php dot net at crazedsanity dot com
14-Jul-2007 01:43
If you're interested in something that creates arrays in PHP, handles attributes well, and is easily transferrable back into XML, you may want to take a look at the cs-phpxml project at SourceForge.net (http://sf.net/projects/cs-phpxml). It's not necessarily documented very well, but it will do something like this:
<?php
/**
*
* *********** EXAMPLE ***********
*
* Original file contents:
* <test xmlns="stuff">
* <indexOne>hello</indexOne>
* <my_single_index testAttribute="hello" />
* <multiple_items>
* <item>1</item>
* <item>2</item>
* </multiple_items>
* </test>
*
* Would return:
*
* array(
* TEST => array(
* type => 'open',
* attributes => array(
* xmlns => 'stuff'
* )
* INDEXONE => 'hello',
* MY_SINGLE_INDEX = array(
* type => 'complete',
*
* )
* )
* );
*/
?>
It's presently under development, but I'm using it in several production environments. The XMLCreator is kinda clunky (builds XML within PHP code). NOTE: it has a dependency on "cs-arraytopath", also available at sourceforge via http://sf.net/projects/cs-arraytopath . The setup is a bit irritating, and it's fragile when handling quoting & formatting the data, but I think it's worth the hassle for most projects.
siteres at gmail dot com
07-Feb-2007 05:38
PHP: XML to array and back:
Here is an XML solution in PHP: XML->Array and Array->XML.
You work with the result as with an ordinary array.
The sources are here:
http://mysrc.blogspot.com/2007/02/php-xml-to-array-and-backwards.html
(leave me comments :)
Example #1 (1.xml):
<ddd>
<onemore dd="55">
<tt>333</tt>
<tt ss="s1">555</tt>
<tt>777</tt>
</onemore>
<two>sdf rr</two>
</ddd>
The code:
$xml=xml2ary(file_get_contents('1.xml'));
print_r($xml);
Here is the Array result:
Array
(
[ddd] => Array (
[_c] => Array (
[_p] => Array *RECURSION*
[onemore] => Array (
[_a] => Array (
[dd] => 55
)
[_c] => Array (
[_p] => Array *RECURSION*
[tt] => Array (
[0] => Array (
[_v] => 333
)
[1] => Array (
[_a] => Array (
[ss] => s1
)
[_v] => 555
)
[2] => Array (
[_v] => 777
)
)
)
)
[two] => Array (
[_v] => sdf rr
)
)
)
)
amish
09-Jan-2007 01:21
The previous parser worked great for me, except for a few issues. It did not work well if an element has attributes. I had a huge XML document with many elements and attributes, and somehow it mixed up my array and messed up the keys. I hope the following code helps fix those issues:
$xml_response = "
<test>
<item>
<name att=\"this should show up\">Item1</name>
<id>item_1</id>
<description> This is Item 1</description>
<quantity>10</quantity>
<navigation website='site1'>test1</navigation>
<navigation website='site2'>test2</navigation>
</item>
</test>
";
$parser = xml_parser_create();
xml_parser_set_option($parser,XML_OPTION_CASE_FOLDING,0);
xml_parser_set_option($parser,XML_OPTION_SKIP_WHITE,1);
xml_parse_into_struct($parser,$xml_response,$values,$tags);
xml_parser_free($parser);
$params = array();
$level = array();
foreach ($values as $xml_elem) {
$start_level = 1;
if ($xml_elem['type'] == 'open') {
if (array_key_exists('attributes',$xml_elem)) {
list($level[$xml_elem['level']],$extra) = array_values($xml_elem['attributes']);
} else {
$level[$xml_elem['level']] = $xml_elem['tag'];
}
$name_array = array();
$i=0;
}
if ($xml_elem['type'] == 'complete') {
if(!in_array($xml_elem['tag'], $name_array)){
array_push($name_array, $xml_elem['tag']);
$i=0;
}else{
$i++;
}
$php_stmt = '$params';
while($start_level < $xml_elem['level']) {
$php_stmt .= '[$level['.$start_level.']]';
$test = $php_stmt;
$start_level++;
}
$php_stmt .= '[$xml_elem[\'tag\']][$i] = $xml_elem[\'value\'];';
if(isset($xml_elem['attributes'])){
foreach ($xml_elem['attributes'] as $key=>$va){
$attribute = '';
$new_stmt = '';
$attribute = "".$i."_attribute_".$key."";
$new_stmt .= $test.'[$xml_elem[\'tag\']][$attribute] = $va;';
eval($new_stmt);
}
}
eval($php_stmt);
$start_level--;
}
if ($xml_elem['type'] == 'close') {
array_pop($level);
}
}
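For reference, attributes show up in the values array under the 'attributes' key of the element they belong to. The following sketch, using a shortened version of the sample XML above, reads them directly; the $sites variable is chosen only for this illustration.

```php
<?php
$xml = '<item>'
     . '<navigation website="site1">test1</navigation>'
     . '<navigation website="site2">test2</navigation>'
     . '</item>';

$parser = xml_parser_create();
xml_parser_set_option($parser, XML_OPTION_CASE_FOLDING, 0);
xml_parse_into_struct($parser, $xml, $vals);

// Map each "website" attribute to the text of its element.
$sites = [];
foreach ($vals as $val) {
    if ($val['type'] === 'complete' && isset($val['attributes']['website'])) {
        $sites[$val['attributes']['website']] = $val['value'];
    }
}
print_r($sites);
```

With case folding left at its default, the attribute key would be WEBSITE instead of website.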
webmaster at after5webdesign dot com
30-Nov-2006 08:48
That parser also has a problem in which it will not parse more items than the current level it is on. That is, parsing this: <1><2>A</2><2>B</2><2>C</2></1>
Will only result in this: A B
C is never processed.
It might be better with something like this:
$file = get_url('http://news.search.yahoo.com/news/rss?p=current+events', URL_CONTENT);
$data = $file['content'];
$xml_parser = xml_parser_create();
xml_parse_into_struct($xml_parser, $data, $vals, $index);
xml_parser_free($xml_parser);
//Uncomment the lines below to see the entire structure of your XML document
//echo "<pre>INDEX: \n";
//print_r ($index);
//echo "\n \n \n VALUES:";
//print_r ($vals);
//echo "</pre>";
$params = array();
$level = array();
$start_level = 1;
foreach ($vals as $xml_elem) {
if ($xml_elem['type'] == 'open') {
if (array_key_exists('attributes',$xml_elem)) {
list($level[$xml_elem['level']],$extra) = array_values($xml_elem['attributes']);
} else {
$level[$xml_elem['level']] = $xml_elem['tag'];
}
}
if ($xml_elem['type'] == 'complete') {
$php_stmt = '$params';
while($start_level < $xml_elem['level']) {
$php_stmt .= '[$level['.$start_level.']]';
$start_level++;
}
$php_stmt .= '[$xml_elem[\'tag\']][] = $xml_elem[\'value\'];';
eval($php_stmt);
$start_level--;
}
}
echo "<pre>";
print_r ($params);
echo "</pre>";
~Tim_Myth
tsivert
17-Nov-2006 07:06
To John.
The reason you only get the last item is that you declare an array of one element that is constantly overwritten by the last element...
I don't know whether you want to put the items in two different child arrays of the parent or in one child array with two elements.
To put the items in two different childarrays, change the line
$php_stmt .= '[$level['.$start_level.']]';
to
$php_stmt .= '[$level['.$start_level.']][]';
To put them in the same child array, change the line
$php_stmt .= '[$xml_elem[\'tag\']] = $xml_elem[\'value\'];';
to
$php_stmt .= '[$xml_elem[\'tag\']][] = $xml_elem[\'value\'];';
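The idea behind this second variant, appending with [] instead of overwriting, can also be sketched without eval. collectByTag() is a hypothetical helper that ignores nesting and simply groups the text of "complete" elements by tag name.

```php
<?php
// Hypothetical helper: groups the text of "complete" elements by tag name,
// appending with [] so that repeated tags are all kept.
function collectByTag(string $xml): array
{
    $parser = xml_parser_create();
    xml_parser_set_option($parser, XML_OPTION_CASE_FOLDING, 0);
    xml_parse_into_struct($parser, $xml, $vals);

    $out = [];
    foreach ($vals as $val) {
        if ($val['type'] === 'complete') {
            $out[$val['tag']][] = isset($val['value']) ? $val['value'] : '';
        }
    }
    return $out;
}

$items = collectByTag('<test><item>First Item</item><item>Second Item</item></test>');
print_r($items);
```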
Hope this helps you!
tsivert
john
11-Nov-2006 03:15
I'm currently using this parser and it's working the way I want it to, but it has a little glitch and I was hoping maybe someone can let me know why.
Here's the parser and use for example purposes the following input:
$xml_response = '<?xml version="1.0" encoding="UTF-8"?>
<test>
<item>First Item</item>
<item>Second Item</item>
</test>';
$xml_parser = xml_parser_create();
xml_parse_into_struct($xml_parser, $xml_response, $vals, $index);
xml_parser_free($xml_parser);
$params = array();
$level = array();
foreach ($vals as $xml_elem) {
if ($xml_elem['type'] == 'open') {
if (array_key_exists('attributes',$xml_elem)) {
list($level[$xml_elem['level']],$extra) = array_values($xml_elem['attributes']);
} else {
$level[$xml_elem['level']] = $xml_elem['tag'];
}
}
if ($xml_elem['type'] == 'complete') {
$start_level = 1;
$php_stmt = '$params';
while($start_level < $xml_elem['level']) {
$php_stmt .= '[$level['.$start_level.']]';
$start_level++;
}
$php_stmt .= '[$xml_elem[\'tag\']] = $xml_elem[\'value\'];';
eval($php_stmt);
}
}
echo "<pre>";
print_r ($params);
echo "</pre>";
In the output, only the last <item> shows (i.e. Second Item). The first one is lost.
What should I change so that it keeps ALL <item> tags?
A3
04-Nov-2006 09:28
XML -> Array
<?
$data = '<root><a><b x="s" a="2">asdf</b><c></c></a></root>';
$p = xml_parser_create();
xml_parse_into_struct($p, $data, $vals);
xml_parser_free($p);
$key = $output = array();
foreach ($vals as $id=>$item) {
if ($item["type"]=="open" || $item["level"]>count($key)) {// && count($key)<=$item["level"])
array_push($key, $id);
$temp = array("tag"=>$item["tag"], "value"=>"", "attributes"=>array());
eval("\$output[".implode("][", $key)."] = \$temp;");
}
if ($item["type"]=="close" || $item["level"]<count($key))// && $item["level"]>=count($key))
array_pop($key);
if (isset($item["attributes"]))
eval("\$output[".implode("][", $key)."]['attributes'] = array_merge(\$output[".implode("][", $key)."]['attributes'], \$item['attributes']);");
if (isset($item["value"]))
eval("\$output[".implode("][", $key)."]['value'] .= \$item['value'];");
}
?>
Elad Elrom
13-Sep-2006 05:14
This is a quick fix for parsing XML from a remote URL. Some of the examples above work when parsing a file on your local server (without "http://") but not when parsing from a remote server using "http://www.URL"...
<?
$file="http://www.URL.com/file.XML";
$xml_parser = xml_parser_create();
$handle = fopen($file, "rb");
if (!$handle) {
die("Could not open $file");
}
// read the remote file in 8 KB chunks
$data = '';
while (!feof($handle)) {
$data .= fread($handle, 8192);
}
fclose($handle);
xml_parse_into_struct($xml_parser, $data, $vals, $index);
xml_parser_free($xml_parser);
$params = array();
$level = array();
foreach ($vals as $xml_elem) {
if ($xml_elem['type'] == 'open') {
if (array_key_exists('attributes',$xml_elem)) {
list($level[$xml_elem['level']],$extra) = array_values($xml_elem['attributes']);
} else {
$level[$xml_elem['level']] = $xml_elem['tag'];
}
}
if ($xml_elem['type'] == 'complete') {
$start_level = 1;
$php_stmt = '$params';
while($start_level < $xml_elem['level']) {
$php_stmt .= '[$level['.$start_level.']]';
$start_level++;
}
$php_stmt .= '[$xml_elem[\'tag\']] = $xml_elem[\'value\'];';
eval($php_stmt);
}
}
echo "<pre>";
print_r ($params);
echo "</pre>";
?>
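The chunked read loop can be tried without network access. The sketch below wraps it in a hypothetical helper, readAll(), and feeds it from an in-memory stream that stands in for the remote file.

```php
<?php
// Hypothetical helper: reads a whole stream in 8 KB chunks.
function readAll($handle): string
{
    $data = '';
    while (!feof($handle)) {
        $chunk = fread($handle, 8192);
        if ($chunk === false) {
            break; // read error: stop with what we have
        }
        $data .= $chunk;
    }
    return $data;
}

// An in-memory stream stands in for the remote file in this sketch.
$handle = fopen('php://memory', 'w+b');
fwrite($handle, '<test><item>First Item</item></test>');
rewind($handle);
$data = readAll($handle);
fclose($handle);

$parser = xml_parser_create();
$ok = xml_parse_into_struct($parser, $data, $vals, $index);
echo $ok, ' ', $vals[1]['value'], "\n";
```');
assert($ok === 1);
assert($vals[1]['value'] === 'First Item');
</test>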
mad dot cat at mcmadcat dot com
06-Sep-2006 07:55
This is my favorite function:
<?php
function mc_parse_xml($filename)
{
$xml = file_get_contents($filename);
$p = xml_parser_create();
xml_parse_into_struct($p, $xml, $values, $index);
xml_parser_free($p);
for ($i=0;$i<count($values);$i++) {
if (isset($values[$i]['attributes'])) {
$parent = $values[$i]['tag'];
$keys = array_keys($values[$i]['attributes']);
for ($z=0;$z<count($keys);$z++)
{
$content[$parent][$i][$keys[$z]] = $values[$i]['attributes'][$keys[$z]];
if (isset($values[$i]['value'])) $content[$parent][$i]['VALUE'] = $values[$i]['value'];
}
}
}
foreach ($content as $key => $values) {
$content[$key] = array_values($content[$key]);
}
if (is_array($content)) return $content;
else return false;
}
?>
webmaster at unitedscripters dot com
17-Jul-2006 08:29
P.S. Keep in mind that some RSS feeds include spurious tags as... HTML entities (see the Google News RSS feeds: they include tables as &lt;table blah blah!).
If so, in my rssSnapper below add this:
<?php
$input=preg_replace("/(<!\\[CDATA\\[)|(\\]\\]>)/", '', $input);
$input=html_entity_decode($input); //<-- added line
?>
You may play around with the code and perfect it, testing it on various feeds. Not _all_ XML is worthy of an XML parser and the sleepless nights it entails.
webmaster at unitedscripters dot com
17-Jul-2006 05:43
It may not be entirely immaterial to stress that when you are dealing with incoming XML files such as RSS feeds, and you are about to include several of them in some page of yours, resorting to the PHP XML functions is neither _necessarily_ the best idea nor _strictly_ indispensable.
I have in mind here a note that appeared in this documentation some time ago, by info at gramba dot tv:
QUOTE
I was working with the xml2array functions below and had big performance problems. I fired them on a 20MB XML file and had to quit since all approaches to parsing were just too slow (more than 20 minutes of parsing, etc.). The solution was parsing it manually with preg_match, which increased performance by more than 20 times (processing time about 1 minute).
UNQUOTE
Calling in a specific XML structure function and arranging a whole class, when all you want from an incoming file may be the contents of a few tags, is not the only option you have in PHP.
Here is a simple function that parses an XML RSS feed using no XML functions. Keeping this in mind may spare you the need to create extremely complex classes like the ones we see here when _all_ you want is a few titles and descriptions from an RSS feed (if that's your goal, you don't need XML parsers):
<?php
function rssSnapper($input='', $limit=0, $feedChannel='Yahoo!News'){
$input=file_get_contents($input);
if(!$input){return '';};
$input=preg_replace("/[\\n\\r\\t]+/", '', $input);
$input=preg_replace("/(<!\\[CDATA\\[)|(\\]\\]>)/", '', $input);
preg_match_all("/<item>(.*?)<\\/item>/", $input, $items, PREG_SET_ORDER);
$limit=(int)$limit;
$limit=($limit && is_numeric($limit) && abs($limit)<sizeof($items))? sizeof($items)-abs($limit): 0;
while(sizeof($items)>$limit){
$item=array_shift($items);
$item=$item[1];
preg_match_all("/<link>(.*?)<\\/link>/", $item, $link, PREG_SET_ORDER);
preg_match_all("/<title>(.*?)<\\/title>/", $item, $title, PREG_SET_ORDER);
preg_match_all("/<author>(.*?)<\\/author>/", $item, $author, PREG_SET_ORDER);
preg_match_all("/<pubDate>(.*?)<\\/pubDate>/", $item, $pubDate, PREG_SET_ORDER);
preg_match_all("/<description>(.*?)<\\/description>/", $item, $description, PREG_SET_ORDER);
if(sizeof($link)){ $link = strip_tags($link[0][1]); };
if(sizeof($title)){ $title = strtoupper( strip_tags($title[0][1]) ); };
if(sizeof($author)){ $author = strip_tags($author[0][1]); };
if(sizeof($pubDate)){ $pubDate = strip_tags($pubDate[0][1]); };
if(sizeof($description)){ $description = strip_tags($description[0][1]); };
print <<<USAVIT
<!-- ITEM STARTS -->
<div class="news_bg_trick">
<a href="$link" class="item" target="_blank">
<span class="title">$title<span class="channel">$feedChannel</span></span>
<span class="title_footer">
by <span class="author">$author</span> -
<span class="date">$pubDate</span>
</span>
<span class="description">$description</span>
</a>
</div>
<!-- ITEM ENDS -->
USAVIT;
}//out of loop
/*unitedscripters.com*/}
?>
The printing phase assigns Css class names: the output is thus fully customizable by a mere style sheet.
The use of strip_tags is a reminder from Chris Shiflett: distrust incoming data, always, anyway.
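As a small sketch of that advice, feed text can be stripped of tags and then escaped before it is printed into HTML. safeText() is a hypothetical helper name, and the escaping step goes slightly beyond what rssSnapper itself does.

```php
<?php
// Hypothetical helper: removes tags from incoming feed text and escapes
// what is left, so it can be printed inside HTML or an attribute value.
function safeText(string $raw): string
{
    return htmlspecialchars(strip_tags($raw), ENT_QUOTES, 'UTF-8');
}

echo safeText('<b>Breaking</b> news & "quotes"'), "\n";
// prints: Breaking news &amp; &quot;quotes&quot;
```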
I hope no typos slipped in during transcription. Arguably not perfect, but I hope it is a good alternative to spending three days on a full-fledged XML parser just to grab... three tags from an RSS feed!
bye, ALberto
efredricksen at gmail dot com
24-May-2006 03:55
Perhaps the one true parser? I modified xademax's fine code to tidy it up, code-wise and style-wise, rationalize some minor craziness, and make the names fit the nomenclature of the XML spec. (There are no uses of eval, and shame on you people who use it.)
<?php
class XmlElement {
var $name;
var $attributes;
var $content;
var $children;
};
function xml_to_object($xml) {
$parser = xml_parser_create();
xml_parser_set_option($parser, XML_OPTION_CASE_FOLDING, 0);
xml_parser_set_option($parser, XML_OPTION_SKIP_WHITE, 1);
xml_parse_into_struct($parser, $xml, $tags);
xml_parser_free($parser);
$elements = array(); // the currently filling [child] XmlElement array
$stack = array();
foreach ($tags as $tag) {
$index = count($elements);
if ($tag['type'] == "complete" || $tag['type'] == "open") {
$elements[$index] = new XmlElement;
$elements[$index]->name = $tag['tag'];
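// attributes and value are only present when the element actually has them
$elements[$index]->attributes = isset($tag['attributes']) ? $tag['attributes'] : null;
$elements[$index]->content = isset($tag['value']) ? $tag['value'] : null;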
if ($tag['type'] == "open") { // push
$elements[$index]->children = array();
$stack[count($stack)] = &$elements;
$elements = &$elements[$index]->children;
}
}
if ($tag['type'] == "close") { // pop
$elements = &$stack[count($stack) - 1];
unset($stack[count($stack) - 1]);
}
}
return $elements[0]; // the single top-level element
}
// For example:
$xml = '
<parser>
<name language="en-us">Fred Parser</name>
<category>
<name>Nomenclature</name>
<note>Noteworthy</note>
</category>
</parser>
';
print_r(xml_to_object($xml));
?>
will give:
xmlelement Object
(
[name] => parser
[attributes] =>
[content] =>
[children] => Array
(
[0] => xmlelement Object
(
[name] => name
[attributes] => Array
(
[language] => en-us
)
[content] => Fred Parser
[children] =>
)
[1] => xmlelement Object
(
[name] => category
[attributes] =>
[content] =>
[children] => Array
(
[0] => xmlelement Object
(
[name] => name
[attributes] =>
[content] => Nomenclature
[children] =>
)
[1] => xmlelement Object
(
[name] => note
[attributes] =>
[content] => Noteworthy<