How to convert each row and column of an HTML table into an array in PHP to scrape table data

To scrape table data by converting each row and column of an HTML table into an array, follow these steps:

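1. First, decide how to locate the target table, for example by the id or class attribute of its table tag.

2. Parse the HTML into a DOM structure with PHP's DOMDocument class, then use the DOMXPath class to find every row of the table.

3. Loop over the rows, store the content of each cell in an indexed array for that row, and append that array to an outer indexed array.

4. Finally, return the resulting two-dimensional array.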

Sample code follows:

Example 1:

<?php
$html = '<table class="table">
            <tr>
                <td>Name</td>
                <td>Age</td>
                <td>Gender</td>
            </tr>
            <tr>
                <td>John</td>
                <td>25</td>
                <td>Male</td>
            </tr>
            <tr>
                <td>Jane</td>
                <td>30</td>
                <td>Female</td>
            </tr>
        </table>';

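// Parse the HTML string into a DOM tree.
$dom = new DOMDocument();
$dom->loadHTML($html);
$xpath = new DOMXPath($dom);

// Select every row of the table whose class attribute is exactly "table".
$tableRows = $xpath->query('//table[@class="table"]//tr');

$data = [];

foreach ($tableRows as $row) {
    // Collect the text of each cell in this row.
    $rowData = [];
    foreach ($row->getElementsByTagName('td') as $cell) {
        $rowData[] = $cell->nodeValue;
    }
    $data[] = $rowData;
}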

print_r($data);
?>

Output:

Array
(
    [0] => Array
        (
            [0] => Name
            [1] => Age
            [2] => Gender
        )

    [1] => Array
        (
            [0] => John
            [1] => 25
            [2] => Male
        )

    [2] => Array
        (
            [0] => Jane
            [1] => 30
            [2] => Female
        )

)
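
If named keys are more convenient than numeric indexes, the first row can serve as the header. The following is only a minimal sketch: it assumes the `$data` array has the shape produced by Example 1 and that the first row holds the column names. The names `$header` and `$records` are introduced here for illustration and are not part of the original example.

```php
<?php
// $data in the shape produced by Example 1: the first row holds the column names.
$data = [
    ['Name', 'Age', 'Gender'],
    ['John', '25', 'Male'],
    ['Jane', '30', 'Female'],
];

// Take the header row off the front; the remaining rows are the records.
$header = array_shift($data);

$records = [];
foreach ($data as $row) {
    // array_combine() needs both arrays to have the same length,
    // so rows whose cell count differs from the header are skipped.
    if (count($row) !== count($header)) {
        continue;
    }
    $records[] = array_combine($header, $row);
}

print_r($records);
```

Each record can then be read by column name, for example `$records[0]['Name']`.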

Example 2:

<?php
$html = '<table id="my-table">
            <tr>
                <td>Product</td>
                <td>Price</td>
            </tr>
            <tr>
                <td>Shoes</td>
                <td>$50</td>
            </tr>
            <tr>
                <td>Shirt</td>
                <td>$20</td>
            </tr>
        </table>';

// Parse the HTML string into a DOM tree.
$dom = new DOMDocument();
$dom->loadHTML($html);
$xpath = new DOMXPath($dom);

// Select every row of the table whose id is "my-table".
$tableRows = $xpath->query('//table[@id="my-table"]//tr');

$data = [];

foreach ($tableRows as $row) {
    // Collect the text of each cell in this row.
    $rowData = [];
    foreach ($row->getElementsByTagName('td') as $cell) {
        $rowData[] = $cell->nodeValue;
    }
    $data[] = $rowData;
}

print_r($data);
?>

Output:

Array
(
    [0] => Array
        (
            [0] => Product
            [1] => Price
        )

    [1] => Array
        (
            [0] => Shoes
            [1] => $50
        )

    [2] => Array
        (
            [0] => Shirt
            [1] => $20
        )

)

As the examples above show, changing the XPath expression selects a different HTML table, and the same loop converts it into a two-dimensional array.
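
Since only the XPath expression changes between the two examples, the shared logic could be wrapped in a function. The sketch below is one possible way to do that, not a definitive implementation: `tableToArray` and its parameter names are hypothetical, and it additionally assumes that header cells (`th`) should be collected alongside `td` cells, that cell text should be trimmed, and that parser warnings from imperfect HTML should be suppressed.

```php
<?php
/**
 * Hypothetical helper: convert the first table matched by $tableXPath
 * into a two-dimensional array of trimmed cell text.
 */
function tableToArray(string $html, string $tableXPath): array
{
    // Collect parser warnings internally instead of emitting them,
    // since scraped HTML is often not well-formed.
    $previous = libxml_use_internal_errors(true);

    $dom = new DOMDocument();
    $dom->loadHTML($html);

    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($dom);
    $tables = $xpath->query($tableXPath);
    if ($tables === false || $tables->length === 0) {
        return []; // invalid expression or no matching table
    }

    $data = [];
    // Query rows relative to the matched table node.
    foreach ($xpath->query('.//tr', $tables->item(0)) as $row) {
        $rowData = [];
        // Match header (th) and data (td) cells that are direct children of the row.
        foreach ($xpath->query('./td | ./th', $row) as $cell) {
            $rowData[] = trim($cell->textContent);
        }
        $data[] = $rowData;
    }

    return $data;
}

$html = '<table id="my-table">
            <tr><th>Product</th><th>Price</th></tr>
            <tr><td> Shoes </td><td>$50</td></tr>
        </table>';

print_r(tableToArray($html, '//table[@id="my-table"]'));
```

Passing `'//table[@class="table"]'` instead would target a table by its class, as in Example 1. Returning an empty array when nothing matches keeps the caller's loop simple, at the cost of not distinguishing "no table found" from "table with no rows".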
