Programming board - python regexp question

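w*s
Posts: 7227
1
For a line like this, how do I get ABT?
<td><a href="http://www.nyse.com/about/listed/lcddata.html?ticker=abt">ABT</a></td>
I did this, but it's not working:
matchObj = re.search(r"[pattern lost]", eachLine)
if matchObj:
    print matchObj.group(3)
Many thanks!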
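d****i
Posts: 4809
2
Try this:
s = '<td><a href="http://www.nyse.com/about/listed/lcddata.html?ticker=abt">ABT</a></td>'
s2 = re.sub('^[pattern lost]', '', s)
s3 = re.sub([arguments lost])
print s3

[Quoting w*s:]
: For a line like this, how do I get ABT?
: <td><a href="http://www.nyse.com/about/listed/lcddata.html?ticker=abt">ABT</a></td>
: I did this, but it's not working:
: matchObj = re.search(r"[pattern lost]", eachLine)
: if matchObj:
:     print matchObj.group(3)
: Many thanks!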
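The regex patterns in the `re.sub` suggestion did not survive archiving, so the following is only a sketch of the same strip-the-tags idea in Python 3. It assumes the input line has the form `<td><a href="...">ABT</a></td>`; the three patterns are illustrative guesses, not the poster's originals.

```python
import re

# Sample line, assumed to match the one in the question.
s = '<td><a href="http://www.nyse.com/about/listed/lcddata.html?ticker=abt">ABT</a></td>'

# Two-step version: strip the leading tags, then the trailing tags.
# Both patterns are assumptions made for this sketch.
without_open = re.sub(r'^<td><a href="[^"]*">', '', s)
ticker = re.sub(r'</a></td>$', '', without_open)

# One-call version: remove anything that looks like a tag.
ticker_one_call = re.sub(r'<[^>]+>', '', s)

print(ticker)
print(ticker_one_call)
```

The one-call version is shorter, but it keeps all text between tags, so it only isolates the ticker when the line contains a single text node.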
w*s
Posts: 7227
3
Too complicated, I was expecting one line, but thanks!

[Quoting d****i:]
: Try this:
: s = '<td><a href="http://www.nyse.com/about/listed/lcddata.html?ticker=abt">ABT</a></td>'
: s2 = re.sub('^[pattern lost]', '', s)
: s3 = re.sub([arguments lost])
: print s3
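V*********r
Posts: 666
4
In [12]: s = '<td><a href="http://www.nyse.com/about/listed/lcddata.html?ticker=abt">ABT</a></td>'
In [13]: import BeautifulSoup
In [14]: bs = BeautifulSoup.BeautifulSoup(s)
In [15]: bs.a.contents
Out[15]: [u'ABT']

[Quoting w*s:]
: For a line like this, how do I get ABT?
: <td><a href="http://www.nyse.com/about/listed/lcddata.html?ticker=abt">ABT</a></td>
: I did this, but it's not working:
: matchObj = re.search(r"[pattern lost]", eachLine)
: if matchObj:
:     print matchObj.group(3)
: Many thanks!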
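The session above uses the Python 2-era `BeautifulSoup` module; on Python 3 the package is imported as `bs4`. As a dependency-free sketch of the same parse-instead-of-regex idea, the standard library's `html.parser` can collect the text inside `<a>` elements. The class name `AnchorTextParser` and the sample line are assumptions made for this example.

```python
from html.parser import HTMLParser


class AnchorTextParser(HTMLParser):
    """Collects the text found inside <a> elements (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self._in_anchor = False
        self.texts = []

    def handle_starttag(self, tag, attrs):
        # Start a new text buffer whenever an <a> tag opens.
        if tag == "a":
            self._in_anchor = True
            self.texts.append("")

    def handle_endtag(self, tag):
        if tag == "a":
            self._in_anchor = False

    def handle_data(self, data):
        # Only keep text that appears between <a> and </a>.
        if self._in_anchor:
            self.texts[-1] += data


# Sample line, assumed to match the one in the question.
s = '<td><a href="http://www.nyse.com/about/listed/lcddata.html?ticker=abt">ABT</a></td>'
parser = AnchorTextParser()
parser.feed(s)
parser.close()
anchor_texts = parser.texts
print(anchor_texts)
```

This is more code than a regex, but it does not depend on the exact attribute order or quoting inside the tag.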
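i***r
Posts: 1035
5
Second this. Beautiful Soup is really the best solution for any webpage data.
I also found this worked:
>>> s = '<td><a href="http://www.nyse.com/about/listed/lcddata.html?ticker=abt">ABT</a></td>'
>>> import re
>>> patt = r'[pattern start lost](.*)[pattern end lost]'
>>> re.search(patt, s).group(1)
'ABT'

[Quoting V*********r:]
: In [12]: s = '<td><a href="http://www.nyse.com/about/listed/lcddata.html?ticker=abt">ABT</a></td>'
: In [13]: import BeautifulSoup
: In [14]: bs = BeautifulSoup.BeautifulSoup(s)
: In [15]: bs.a.contents
: Out[15]: [u'ABT']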
l**********n
Posts: 8443
6
How about this:
import re
eachLine = '<td><a href="http://www.nyse.com/about/listed/lcddata.html?ticker=abt">ABT</a></td>'
matchObj = re.search(r'[pattern start lost](.*)<\/a><\/td>', eachLine)
if matchObj:
    print matchObj.group(2)
    print matchObj.group(1)
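Only the tail of the pattern in this post survived archiving, so here is a hedged Python 3 sketch of a two-group search in the same spirit. It assumes group 1 captured the `href` value and group 2 the link text; the pattern itself is a guess, and the sample line is assumed to match the question.

```python
import re

each_line = '<td><a href="http://www.nyse.com/about/listed/lcddata.html?ticker=abt">ABT</a></td>'

# Assumed pattern: group 1 is the href value, group 2 is the link text.
# [^"]* and [^<]* stop each group at its own delimiter, unlike a greedy .*
match_obj = re.search(r'<td><a href="([^"]*)">([^<]*)</a></td>', each_line)
if match_obj:
    print(match_obj.group(2))  # link text
    print(match_obj.group(1))  # URL

# One-line form, safe only when a match is guaranteed:
# re.search returns None on failure, and None has no .group().
ticker = re.search(r'<a [^>]*>([^<]*)</a>', each_line).group(1)
print(ticker)
```

Using negated character classes instead of `(.*)` matters once a line holds more than one link, because a greedy `.*` would run to the last `</a>` on the line.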
d******e
Posts: 2265
7
>>> m = re.search('(\w+)', str)
and if you are not sure:
>>> m = re.search('>(\w+)', str)
>>> m.group(0)
'>ABT'
>>> m.group(1)
'ABT'

[Quoting w*s:]
: Too complicated, I was expecting one line, but thanks!
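A short Python 3 sketch of how the `\w+` shortcut behaves, assuming the input line has the form `<td><a href="...">ABT</a></td>`. The `dotted` line is a hypothetical example that is not from the thread; it is there to show where `\w+` stops.

```python
import re

# Sample line, assumed to match the one in the question.
line = '<td><a href="http://www.nyse.com/about/listed/lcddata.html?ticker=abt">ABT</a></td>'
# Hypothetical line with a dot in the link text (not from the thread).
dotted = '<td><a href="#">BRK.A</a></td>'

# A bare (\w+) finds the first run of word characters, which is the tag name.
bare = re.search(r'(\w+)', line).group(1)

# Anchoring on '>' skips '<td>' (followed by '<') and lands on the link text.
short = re.search(r'>(\w+)', line).group(1)

# \w does not match '.', so the capture is cut off at the dot.
cut_off = re.search(r'>(\w+)', dotted).group(1)

# Capturing everything up to the next tag keeps the full text.
full = re.search(r'>([^<>]+)<', dotted).group(1)

print(bare, short, cut_off, full)
```

The `>(\w+)` form is the shortest answer for plain alphanumeric text; the `[^<>]+` variant is a possible fallback when the text may contain punctuation.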
w*s
Posts: 7227
8
This is what I ended up with as well.
Thanks, everyone, for your replies!

[Quoting l**********n:]
: How about this:
: import re
: eachLine = '<td><a href="http://www.nyse.com/about/listed/lcddata.html?ticker=abt">ABT</a></td>'
: matchObj = re.search(r'[pattern start lost](.*)<\/a><\/td>', eachLine)
: if matchObj:
:     print matchObj.group(2)
:     print matchObj.group(1)
w*s
Posts: 7227
9
To show my thanks, here's a treat!

[Quoting w*s:]
: This is what I ended up with as well.
: Thanks, everyone, for your replies!
