Situation
Extracting content from an HTML file using Nokogiri and saving that content to a text (.txt) file.
Solution
The corresponding Ruby code:
require 'nokogiri'
# Parse the HTML file into a Nokogiri document
doc = File.open('/Users/admin/Desktop/test.html') { |f| Nokogiri::HTML(f) }
puts '### Scrap'
begin
  file = File.open('/Users/admin/Desktop/tags.txt', 'w')
  # Use a CSS selector to target the nodes that contain the content
  doc.css('span.select-menu-item-text.js-select-button-text.js-navigation-open').each do |span|
    puts span.content
    file.write(span.content.to_s + "\n")
  end
rescue IOError, SystemCallError => e
  # An error occurred, e.g. the directory is not writable (raised as Errno::EACCES, a SystemCallError)
  warn "Could not write the output file: #{e.message}"
ensure
  file.close unless file.nil?
end
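The begin/ensure block above exists mainly to guarantee that the output file gets closed. As a possible alternative, the block form of File.open closes the file automatically when the block exits. The sketch below isolates just that file-writing step using only the standard library; the helper name write_lines and the sample strings are assumptions for illustration and do not come from the original script.

```ruby
require "tmpdir"

# Hypothetical helper (not part of the original script): writes each string
# in `lines` to `path`, one per line. The block form of File.open closes the
# file when the block exits, even if an exception is raised inside it.
def write_lines(path, lines)
  File.open(path, "w") do |file|
    lines.each { |line| file.puts(line) }
  end
end

# Usage with made-up sample data, written to a temporary directory
# so that no real files are touched.
Dir.mktmpdir do |dir|
  path = File.join(dir, "tags.txt")
  write_lines(path, ["first tag", "second tag"])
  puts File.read(path)
end
```

In the script above, the list passed in would presumably be the text of the matched nodes, for example doc.css(...).map(&:content), with the output path pointing at tags.txt.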