Nokogiri: how to extract part of the fields [on hold]

- Last updated: 2016-02-01
Question:

My text file looks like this:

<first>1</first><Name>wangli</Name><birthday>19860105</birthday><address>Here</address>
<first>2</first><Name>zhangli</Name><birthday>19870105</birthday><address>Sangdu</address>
<first>3</first><Name>lili</Name><birthday>19880105</birthday><address>Hongkong</address>
<first>4</first><Name>liuli</Name><birthday>19860515</birthday><address>London</address>

I want to create a new file with the Ruby gem Nokogiri that looks like this:

wangli-Here
zhangli-Sangdu
lili-Hongkong
liuli-London

I used:

require 'nokogiri'
doc = Nokogiri::XML(File.open("file"),nil,"gbk")
puts doc.xpath("/name") + doc.xpath("/address")

but it doesn't work.

Comment: Check the Nokogiri cheat sheet: github.com/sparklemotion/nokogiri/wiki/Cheat-sheet

Comment: Welcome to Stack Overflow. Please read "How to Ask", "Minimal, Complete, and Verifiable example" and meta.stackoverflow.com/q/261592/128421. Your code doesn't show your effort; it looks like a minor attempt made in the hope that we'll fill in the blanks.

Comment: Is your text file actually an XML document, or is it really a series of XML fragments? If it's the latter, how did it get that way?

Answer:

Since each line of your input contains an XML fragment rather than a complete document, you have to process the lines one by one, parsing each with Nokogiri::XML.fragment. Here is a working example:

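require "nokogiri"

# Write one "Name-address" line to output.txt for each input line.
File.open("output.txt", "w") do |output|
  File.open("input.xml", "r") do |f|
    f.each_line do |line|
      # Each line is a standalone XML fragment, not a full document.
      frag = Nokogiri::XML.fragment(line)
      output.puts "#{frag.search('Name').text}-#{frag.search('address').text}"
    end
  end
end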

Comment: Thanks a lot. But how can I get Name.language GBK?

Comment: @Syutran I am not sure what Name.language means, or what GBK is.

Answer:

It looks like the problem is solved! I changed the text file to:

<doc>
<line><first>1</first><Name>wangli</Name><birthday>19860105</birthday><address>Here</address></line>
<line><first>2</first><Name>zhangli</Name><birthday>19870105</birthday><address>Sangdu</address></line>
<line><first>3</first><Name>lili</Name><birthday>19880105</birthday><address>Hongkong</address></line>
<line><first>4</first><Name>liuli</Name><birthday>19860515</birthday><address>London</address></line>
</doc>

and then used this Ruby code:

require 'nokogiri'
doc = Nokogiri::XML(File.open("27065"), nil, "gbk")
doc.xpath("//line").each do |line|
  # XPath is case-sensitive: the element is <Name>, not <name>.
  puts line.xpath("./Name").text + "-" + line.xpath("./address").text
end

Comment: It's not necessary to modify the file at all. A better question is why the file is structured like that; it's not valid XML.