This is a Flash/Central blog, but it has a code-off like the one we talked
about at the meeting last night (Jeff referred to the 10-million-word
document parser). There are no real entries yet, but it sounds fun.
http://markme.com/mesh/ (Feb 25, 2004 entry)
John C. Bland II
JDEV Inc.
Business: http://www.jdevinc.com
Thread at a glance:
Previous Message by Date:
RE: running exe's
Thx Ray.
John C. Bland II
JDEV Inc.
Business: http://www.jdevinc.com
-----Original Message-----
From: Ray Ragan [mailto:rraga@xxxxxxxxxxxxx]
Sent: Wednesday, February 25, 2004 5:17 PM
To: azcfug-hHKSG33TihhbjbujkaE4pw@xxxxxxxxxxxxxxxx
Subject: Re: [azcfug] running exe's
John,
In a word, no. It is a security measure. You can, however, set up a script
page to execute an exe on the server.
Good luck,
Ray
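A minimal sketch of the server-side idea Ray describes, written in Python for illustration. The thread does not say which scripting language or executable was meant, so the function name `run_server_tool`, the timeout, and the stand-in command are all assumptions.

```python
import subprocess
import sys


def run_server_tool(command, timeout_seconds=30):
    """Run a fixed command on the server and return (exit code, stdout).

    `command` is a list such as ["tool.exe", "--flag"]; passing a list
    means no shell is involved, so arguments are not reinterpreted.
    """
    completed = subprocess.run(
        command,
        capture_output=True,      # collect stdout/stderr instead of printing
        text=True,                # decode output as text
        timeout=timeout_seconds,  # give up on a hung process
        check=False,              # report the exit code rather than raising
    )
    return completed.returncode, completed.stdout


if __name__ == "__main__":
    # Stand-in for a real executable: the Python interpreter itself.
    code, output = run_server_tool(
        [sys.executable, "-c", "print('hello from the server')"]
    )
    print(code, output.strip())
```

The executable runs on the server, not in the visitor's browser, and the page should only ever run commands the site owner has hard-coded, never ones built from user input.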
"John C. Bland II" <john@xxxxxxxxxxxxx> on 02/25/2004 03:58:35 PM
Please respond to azcfug-hHKSG33TihhbjbujkaE4pw@xxxxxxxxxxxxxxxx
To: <azcfug-hHKSG33TihhbjbujkaE4pw@xxxxxxxxxxxxxxxx>
cc:
Subject: [azcfug] running exe's
Can I run an executable file in the browser?
John C. Bland II
JDEV Inc.
Business: http://www.jdevinc.com
Yahoo! Groups Links
To visit your group on the web, go to:
http://groups.yahoo.com/group/azcfug/
To unsubscribe from this group, send an email to:
azcfug-unsubscribe-hHKSG33TihhbjbujkaE4pw@xxxxxxxxxxxxxxxx
Your use of Yahoo! Groups is subject to the Yahoo! Terms of Service.
Next Message by Date:
RE: Parsing HTML
Connie,
You can handle this a couple of ways: either start your search again after
the first instance, or grab the whole code chunk (the code between
<td class="blah"> and the second instance of </td>) and parse the
in-between code (pull out </td> <td ...>).
Scraping is all about patience.
Hope that helps,
Ray
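The first option Ray mentions (restart the search after the first instance) might look like the Python sketch below. The function name `number_after_name` is hypothetical, and the sample string is just the two-cell snippet from Connie's message. Because the second search starts from where the name was found, the line feed and spaces between the cells do not matter.

```python
def number_after_name(html, name,
                      open_marker='<td class="blah">',
                      close_marker="</td>"):
    """Return the text of the cell that follows the cell containing `name`."""
    name_pos = html.find(name)
    if name_pos == -1:
        return None
    # Search again, starting after the name: the next opening marker
    # begins the cell we actually want.
    start = html.find(open_marker, name_pos)
    if start == -1:
        return None
    start += len(open_marker)
    end = html.find(close_marker, start)
    if end == -1:
        return None
    return html[start:end].strip()


sample = '<td class="blah">Mister Jones</td>\n   <td class="blah">789</td>'
print(number_after_name(sample, "Mister Jones"))  # prints 789
```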
"Constanty \"Connie\" DeCinko III" <cdecinko@xxxxxxxxxxxxx> on 02/25/2004
09:32:53 PM
Please respond to azcfug-hHKSG33TihhbjbujkaE4pw@xxxxxxxxxxxxxxxx
To: <azcfug-hHKSG33TihhbjbujkaE4pw@xxxxxxxxxxxxxxxx>
cc:
Subject: RE: [azcfug] Parsing HTML
I'm tripped up if the source has line feeds. For instance:
<td class="blah">Mister Jones</td>
<td class="blah">789</td>
I want to search for the name, then grab the number.
Do I need to search twice? Or search for the exact string including line
break and spaces?
-----Original Message-----
From: Ray Ragan [mailto:rraga@xxxxxxxxxxxxx]
Sent: Wednesday, February 25, 2004 5:22 PM
To: azcfug-hHKSG33TihhbjbujkaE4pw@xxxxxxxxxxxxxxxx
Subject: Re: [azcfug] Parsing HTML
Connie,
Just make up the results, no one will notice ;)
I don't have a tutorial, but I can give you a tip. When scraping, you
must identify a unique, static marker (or enumerable static markers)
preceding your target in the HTML; the closer to your target, the better.
Then find a static marker to close your string. Then parse through the
data, one entity at a time.
It is extremely tedious, but if your scraping target doesn't change, it's
long-lived content.
Good luck,
Ray
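The marker technique in this tip can be sketched in Python as follows. This is an illustration under assumptions, not Ray's actual code: the helper names `extract_between` and `extract_all` and the sample page are hypothetical, with the sample markup borrowed from elsewhere in the thread.

```python
def extract_between(html, start_marker, end_marker, from_pos=0):
    """Return (text, next_pos) for the first span between the two markers.

    Returns (None, -1) when either marker is missing.
    """
    start = html.find(start_marker, from_pos)
    if start == -1:
        return None, -1
    start += len(start_marker)
    end = html.find(end_marker, start)
    if end == -1:
        return None, -1
    return html[start:end].strip(), end + len(end_marker)


def extract_all(html, start_marker, end_marker):
    """Walk the page one entity at a time, collecting every marked span."""
    results = []
    pos = 0
    while True:
        text, pos = extract_between(html, start_marker, end_marker, pos)
        if text is None:
            break
        results.append(text)
    return results


page = '<td class="blah">Mister Jones</td>\n<td class="blah">789</td>'
print(extract_all(page, '<td class="blah">', "</td>"))
# prints ['Mister Jones', '789']
```

The closer the start marker sits to the target, the less likely an unrelated layout change elsewhere on the page will break the scrape.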
"DeCinko, Connie" <cdecinko@xxxxxxxxxxxxx> on 02/25/2004 04:12:59 PM
Please respond to azcfug-hHKSG33TihhbjbujkaE4pw@xxxxxxxxxxxxxxxx
To: <azcfug-hHKSG33TihhbjbujkaE4pw@xxxxxxxxxxxxxxxx>
cc:
Subject: [azcfug] Parsing HTML
I'm looking for good step-by-step instructions or a tutorial on how to
extract data from an HTML file. I need to grab the election results from
the county HTML web page and display them in our page.
Constanty "Connie" DeCinko III
Web Content Administrator
City of Glendale, Arizona
(See attached file: DeCinko, Connie.vcf)
Previous Message by Thread:
Parsing HTML
I'm looking for good step-by-step instructions or a tutorial on how to
extract data from an HTML file. I need to grab the election results
from the county HTML web page and display them in our page.
Constanty "Connie" DeCinko III
Web Content Administrator
City of Glendale, Arizona
Next Message by Thread:
killer freeware tools
after last night's meeting, i mentioned a thread on crystaltech's
forums that talks about freeware apps and utilities that you "just
can't be without".
for those interested:
http://www.crystaltech.com/forum/topic.asp?TOPIC_ID=1955&FORUM_ID=1&CAT_ID=1&Topic_Title=Free+%26+Useful+Tools&Forum_Title=General+Discussion+Forum
(sorry 'bout the big ol' ugly URL)
enjoy :)
Charlie