Installation
On CentOS, install Ruby and the MySQL database.
[cc lang="text"]
# yum install ruby ruby-irb mysql mysql-server ruby-mysql
[/cc]
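Variables
Global variables start with $;
instance variables start with @;
local variables are written as a bare name with no prefix.
[cc lang="ruby"]
$global_variable = 10 # global variable
@cust_id = 1          # instance variable
var = "hehe"          # local variable
[/cc]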
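To make the difference in scope a bit more concrete, here is a small sketch. The `Customer` class, its `describe` method and the `$request_count` global are made up for illustration and are not part of the original notes.

```ruby
$request_count = 0            # global: visible from every scope

# Customer is a hypothetical class used only for illustration.
class Customer
  def initialize(id)
    @cust_id = id             # instance variable: each object has its own copy
  end

  def describe
    $request_count += 1       # a global can be read and written anywhere
    label = "customer"        # local: exists only inside this method
    "#{label} #{@cust_id}"
  end
end

a = Customer.new(1)
b = Customer.new(2)
puts a.describe               # customer 1
puts b.describe               # customer 2
puts $request_count           # 2
puts defined?(label).inspect  # nil, the local from describe is not visible here
```

Globals are convenient in short scripts, but instance variables usually keep state easier to follow once the code grows.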
Methods (functions)
[cc lang="ruby"]
def method_name [( [arg [= default]]…[, * arg [, &expr ]])]
  expr..
end
[/cc]
A method that takes no arguments can be called by its name alone.
[cc lang="ruby"]
method_name
[/cc]
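The syntax summary above is fairly abstract, so here is one possible concrete reading of it. `greet` is a hypothetical method that uses a default argument, a splat argument and an optional block.

```ruby
# greet is a hypothetical method that exercises each part of the syntax above.
def greet(name = "world", *extras, &block)
  message = "Hello, #{name}"
  message += " and #{extras.join(', ')}" unless extras.empty?
  block ? block.call(message) : message   # the block is optional
end

puts greet                            # Hello, world
puts greet("Ruby")                    # Hello, Ruby
puts greet("Ruby", "Rails", "Rack")   # Hello, Ruby and Rails, Rack
puts greet("Ruby") { |m| m.upcase }   # HELLO, RUBY
```

The first call shows the point made above: with no arguments, the bare name is enough to invoke the method.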
Socket
[cc lang="ruby"]
require 'socket' # socket is part of the standard library
hostname = 'localhost'
port = 2000
s = TCPSocket.open(hostname, port)
while line = s.gets # read the socket line by line
  puts line.chomp # print to the terminal without the trailing newline
end
s.close # close the socket
[/cc]
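The client above needs something listening on the other end. Below is a minimal sketch of a matching server, run in the same process so it can be tried without a second terminal. `serve_once` is a hypothetical helper, and the sketch binds to port 0 (the OS picks a free port) rather than port 2000, purely to avoid clashing with anything already running.

```ruby
require 'socket'

# serve_once is a hypothetical helper: it accepts a single client,
# sends each line, then closes the connection.
def serve_once(server, lines)
  client = server.accept
  lines.each { |l| client.puts l }
ensure
  client.close if client
end

server = TCPServer.new('127.0.0.1', 0)   # port 0 lets the OS pick a free port
port = server.addr[1]                    # the port that was actually assigned
worker = Thread.new { serve_once(server, ["hello", "bye"]) }

# Same read loop as the client above, pointed at the local server.
s = TCPSocket.open('127.0.0.1', port)
while line = s.gets
  puts line.chomp
end
s.close
worker.join
server.close
```

Closing the connection on the server side is what makes `s.gets` return `nil` and ends the client's loop.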
HTTP example
[cc lang="ruby"]
require 'socket'
host = 'www.w3cschool.cc' # web server
port = 80 # default HTTP port
path = "/index.htm" # path of the file to fetch
# This is the HTTP request
request = "GET #{path} HTTP/1.0\r\n\r\n"
socket = TCPSocket.open(host, port) # connect to the server
socket.print(request) # send the request
response = socket.read # read the complete response
socket.close # release the connection
# Split response at first blank line into headers and body
headers, body = response.split("\r\n\r\n", 2)
print body # print the result
[/cc]
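The example above only prints the body and leaves the header part unused. As a rough sketch of what could be done with it, the hypothetical `parse_response` helper below splits a raw response into status code, headers and body. It runs on a hand-written sample string, so no network access is needed; the sample is an assumption, not output captured from the real server.

```ruby
# parse_response is a hypothetical helper, not part of the socket library.
def parse_response(response)
  head, body = response.split("\r\n\r\n", 2)        # same split as above
  status_line, *header_lines = head.split("\r\n")
  _version, code, reason = status_line.split(" ", 3)
  headers = header_lines.map do |l|
    name, value = l.split(":", 2)
    [name.strip.downcase, value.to_s.strip]          # header names are case-insensitive
  end.to_h
  { code: code.to_i, reason: reason, headers: headers, body: body.to_s }
end

# Hand-written sample response, used here instead of a live request.
sample = "HTTP/1.0 200 OK\r\nContent-Type: text/html\r\nContent-Length: 5\r\n\r\nhello"
result = parse_response(sample)
puts result[:code]                      # 200
puts result[:headers]["content-type"]   # text/html
puts result[:body]                      # hello
```

For anything beyond experiments, the standard library's `net/http` handles this parsing, along with redirects and chunked responses.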
Regular expression example
[cc lang="ruby"]
#!/usr/bin/ruby
line1 = "Cats are smarter than dogs"
line2 = "Dogs also like meat"
if ( line1 =~ /Cats(.*)/ )
  puts "Line1 contains Cats"
end
if ( line2 =~ /Cats(.*)/ )
  puts "Line2 contains Cats"
end
[/cc]
Crawler example
https://github.com/feichashao/fetch_kw
http://rubylearning.com/satishtalim/ruby_socket_programming.html
[cc lang="ruby"]
# Put kw into Database
def content_handle(kw,content,db)
  db_result = db.query("INSERT INTO #{KW_TBL_NAME}(keyword) VALUES(\"#{kw}\")")
  # Get more keywords
  result_div = /
  result_kw = result_div[1].scan(/
  # Put keywords into to_visit.
  if result_kw.respond_to?("each") and @to_visit.length <= MAX_TO_VISIT
    result_kw.each do |rkw|
      @mutex.lock
      @to_visit << rkw
      @mutex.unlock
      puts "Got kw: #{rkw}\n"
    end
  end
end
[/cc]
Multithreading
[cc lang="ruby"]
# Multi-thread
t1 = Thread.new{fetch()}
t2 = Thread.new{fetch()}
t3 = Thread.new{fetch()}
t4 = Thread.new{fetch()}
t5 = Thread.new{fetch()}
t1.join
t2.join
t3.join
t4.join
t5.join
[/cc]
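The five `Thread.new` / `join` pairs and the explicit `@mutex.lock` / `@mutex.unlock` calls can be written more compactly. The sketch below is one way to do it, not the crawler's actual code: `fetch_one` and `fetch_all` are hypothetical names, and `fetch_one` just upcases its input in place of a real network request.

```ruby
require 'thread'

# fetch_one is a hypothetical stand-in for the real network request.
def fetch_one(keyword)
  keyword.upcase
end

# fetch_all is also hypothetical: it runs worker_count threads over a shared queue.
def fetch_all(keywords, worker_count = 5)
  to_visit = Queue.new                    # thread-safe work queue
  keywords.each { |kw| to_visit << kw }
  results = []
  mutex = Mutex.new

  workers = Array.new(worker_count) do
    Thread.new do
      loop do
        kw = begin
          to_visit.pop(true)              # non-blocking pop raises ThreadError when empty
        rescue ThreadError
          break                           # queue drained, this worker is done
        end
        fetched = fetch_one(kw)
        mutex.synchronize { results << fetched }  # unlocks even if the block raises
      end
    end
  end
  workers.each(&:join)                    # wait for every worker
  results
end

puts fetch_all(%w[ruby socket mysql thread regex]).sort.inspect
```

`Mutex#synchronize` releases the lock even when the block raises, which a separate `lock` / `unlock` pair does not guarantee.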
The crawler fetches Baidu search results and keywords.
References
http://www.w3cschool.cc/ruby/ruby-tutorial.html