moderately simple macroing

Malcolm
Posts: 32040
Joined: Fri May 21, 2004 1:04 pm
Location: Minneapolis

Post by Malcolm »

Alright, I need to duplicate what I think is the real DB of info for our hosted sites. Unfortunately, I can't just write a convenient little script to grab that info due to various things. I just need a small app that can grab a list of webpages (following a regular pattern) & save the HTML source to a .txt file (for later parsing). Anyone got any favourite macro programs they use?
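For reference, the grab-and-save loop described here can be sketched in a few lines of Python. This is a minimal sketch, assuming the pages differ only by a number in the URL and need no login. The URL pattern, the output folder, and the helper names (`build_urls`, `fetch`, `save_sources`) are all made up for illustration.

```python
import urllib.request
from pathlib import Path


def build_urls(pattern, start, stop):
    """Expand a pattern like 'http://example.com/p{}.html' for start..stop."""
    return [pattern.format(n) for n in range(start, stop + 1)]


def fetch(url):
    """Return the raw HTML source of one page as text."""
    with urllib.request.urlopen(url, timeout=30) as response:
        return response.read().decode("utf-8", errors="replace")


def save_sources(urls, out_dir, fetcher=fetch):
    """Write each page's source to out_dir/page_0001.txt, page_0002.txt, ..."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    saved = []
    for index, url in enumerate(urls, start=1):
        target = out / f"page_{index:04d}.txt"
        target.write_text(fetcher(url), encoding="utf-8")
        saved.append(target)
    return saved
```

Something like `save_sources(build_urls("http://example.com/products/{}.html", 1, 10), "sources")` would then leave ten .txt files in a `sources` folder for later parsing. The `fetcher` argument is only there so the download step can be swapped out.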
Diogenes of Sinope: "It is not that I am mad, it is only that my head is different from yours."
Arnold Judas Rimmer, BSC, SSC: "Better dead than smeg."

Post by Malcolm »

Fuck it, finally broke down & got the latest version of EZ Macros.
TheCatt
Site Admin
Posts: 57681
Joined: Thu May 20, 2004 11:15 pm
Location: Cary, NC

Post by TheCatt »

Curl. Possibly curl inside of a VBScript/perl/whatever.
It's not me, it's someone else.
GORDON
Site Admin
Posts: 56735
Joined: Sun Jun 06, 2004 10:43 pm
Location: DTManistan

Post by GORDON »

Malcolm wrote:Fuck it, finally broke down & got the latest version of EZ Macros.
I know where you can get a version of EZ Macros from 1998 where you get the full version just by installing and changing the executable from "ezeval" to "ezmacros."
"Be bold, and mighty forces will come to your aid."
TPRJones
Posts: 13418
Joined: Fri May 21, 2004 2:05 pm
Location: Houston

Post by TPRJones »

There's probably some way to do this with Visual Basic, but I don't know how to get it to hit web addresses.

If EZMacros doesn't work, let me know and I'll hack at it one night with VB.

EDIT: Okay, the Open method hits web addresses just fine, so yeah, VB could take a list of web addresses and cycle through them opening them and saving to the local drive. Not sure just how to convert to source text on the way but I'm sure it can probably be done.




Edited By TPRJones on 1193075170
"ATTENTION: Customers browsing porn must hold magazines with both hands at all times!"

Post by TPRJones »

This doesn't work. It will pull up and save files just fine, but it doesn't get at the source code. I can't seem to figure out how to make that part work. Yet. But to yank the visible file contents, this will work fine.

Make a sheet in Excel with column A holding the complete web addresses (without http:// at the front; you can include it if you prefer, just take it out of the code below) and column B holding where to save each file (full drive and path; as with the web addresses, you can put the path into the script below instead of adding it to every line of the sheet). Make a copy of your sheet, because the macro below churns through it and deletes the contents as it goes. Execute the following code on the copy:

Code:

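   Sub SavePagesAsText()
        Dim listSheet As Worksheet, page As Workbook
        ' Remember the list sheet: once a page is opened, ActiveSheet points at the page instead
        Set listSheet = ActiveSheet
        Do While listSheet.Range("A1").Value <> ""
            ' Open the address in A1 and save its visible contents as text to the path in B1
            Set page = Workbooks.Open(Filename:="http://" & listSheet.Range("A1").Value)
            page.SaveAs Filename:=listSheet.Range("B1").Value, FileFormat:=xlUnicodeText, CreateBackup:=False
            page.Close SaveChanges:=False
            ' Delete the finished row so the next address moves up into row 1
            listSheet.Rows(1).Delete Shift:=xlUp
        Loop
   End Sub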
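To make the "visible contents" versus "source" distinction concrete, here is a rough sketch in Python rather than VB. It is not a fix for the macro, only an illustration: the source is the text the server sends, and the visible contents are roughly what is left once the tags are stripped. The sample HTML string and the `VisibleText` helper are invented for the example.

```python
from html.parser import HTMLParser


class VisibleText(HTMLParser):
    """Collect only the text a reader would see, dropping tags and scripts."""

    def __init__(self):
        super().__init__()
        self.chunks = []
        self._skip = 0  # greater than zero while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip:
            self._skip -= 1

    def handle_data(self, data):
        if not self._skip and data.strip():
            self.chunks.append(data.strip())


def visible_text(source):
    """Return the visible text of an HTML string, joined with spaces."""
    parser = VisibleText()
    parser.feed(source)
    parser.close()
    return " ".join(parser.chunks)


# Invented sample page.
source = "<html><body><h1>Widgets</h1><p>Price: <b>$5</b></p></body></html>"
print(source)                # the source, tags and all
print(visible_text(source))  # roughly what a save-as-text keeps
```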
The source code eludes me so far. I will find a way, though.

Post by Malcolm »

You take way too much pleasure in these technical intricacies.

Post by TheCatt »

Anyone with BALLS would have used curl.

Post by TPRJones »

Shush, you. This is what I do for fun.

I don't have FrontPage, and that's probably where the library with the command I'd need to make this work is hiding. If there's a way to view HTML source code in Word or Excel or Publisher, I sure can't find it.

I like VB because it's on almost everyone's computer these days, making it the most widely installed compiler I'm aware of. But it's pretty heavily tied into the Office suite, and that can be a bit annoying when you are trying to do something the Office apps weren't really meant to do.

Post by Malcolm »

Well, hell, the other technical issue you might take orgasmic pleasure in is this ...

I log into the admin panel for a site going thru some weird login page using something that's obviously not a standard HTML form, but when you look at the source, you can clearly make out where the username & password go. All that said, I can't figure out how to make it submit the form w\ my desired credentials, so I can then go in & browse the admin panel (where all the info I want is).
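Submitting credentials from a script usually comes down to sending the same POST the browser would send. A minimal sketch in Python, assuming the login page accepts an ordinary URL-encoded POST (which may not hold for this one). The URL, the field names `txtUser` and `txtPass`, and the credentials are all placeholders, not taken from the real page.

```python
import urllib.parse
import urllib.request


def build_login_request(login_url, username, password,
                        user_field="username", pass_field="password"):
    """Build (but do not send) a POST request carrying the credentials."""
    body = urllib.parse.urlencode({user_field: username, pass_field: password})
    return urllib.request.Request(
        login_url,
        data=body.encode("ascii"),
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )


# Placeholder URL, field names, and credentials.
request = build_login_request(
    "http://example.com/admin/login.asp", "someuser", "s3cret",
    user_field="txtUser", pass_field="txtPass",
)
print(request.get_method(), request.full_url)
print(request.data)
```

Sending it would be `urllib.request.urlopen(request)`. The real field names have to be read out of the login page source, and any hidden fields in that page would need to go into the body as well.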

Normally I just pick thru webpages by grabbing them in .txt format w\ Python & parsing them later at my leisure. However, I can't figure out a way to log in to the site from the Python code. Which means I can't get whatever info is used to verify that the connection is secure, which itself means I never get to see the HTML for the admin panel.

So, I did some digging & found ways to feed cookies into Python code (I guessed they were using a cookie -- stupid me). It appears to be some kind of mutant ASP session id that I cannot, in any way, shape, or form, even find in the browser's cache in any usable format.

Then I thought I could hack my way in w\ Javascript & just steal things using getElementById. I do not have permission to grab elements by id. Fucking unreal. If they went to as much trouble designing their interfaces as they do their security measures (keeping me from MY OWN DATA, no less), the end result might not suck ass so hard. Fuck you, Monster Commerce.

Post by TPRJones »

Ah, well in that case even if I could fix the view-source problem, that bit of code wouldn't help you any.

That does sound ugly. If you can just get logged in, EZMacros sounds like the way to go.




Edited By TPRJones on 1193083114

Post by TheCatt »

CURL does logins.

Post by Malcolm »

& it can save the page source for each of the ~1000 pages I need?

Post by TheCatt »

Yes. IT CAN DO EVERYTHING.

I know we've covered this before.

Post by TheCatt »

Yeah, we did.

It does HTTP and FTP automation like nothing else. You may need to have a scripting language around it to variablize things, but it can do everything you've mentioned.
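As a sketch of what "a scripting language around it" might look like, here is a small Python wrapper that builds one curl command per page and runs them in turn. It assumes curl is on the PATH. The URL pattern, the output file names, and the helper names are hypothetical; the flags used (`-s`, `-L`, `-o`) are standard curl options.

```python
import subprocess


def curl_command(url, out_file):
    """Argument list for one curl call that saves a page's source to a file."""
    # -s: no progress meter, -L: follow redirects, -o: write the body to out_file
    return ["curl", "-s", "-L", "-o", out_file, url]


def commands_for_pages(pattern, first, last):
    """One curl command per page number, first..last inclusive."""
    return [
        curl_command(pattern.format(n), f"page_{n}.txt")
        for n in range(first, last + 1)
    ]


def run_all(commands):
    """Run each command in turn; raises CalledProcessError if curl fails."""
    for command in commands:
        subprocess.run(command, check=True)
```

With a made-up pattern, `run_all(commands_for_pages("http://example.com/item.asp?id={}", 1, 1000))` would fetch pages 1 through 1000 into `page_1.txt` through `page_1000.txt`.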

Post by Malcolm »

Alright, for Christ's sake. Is this something that I can hit the command line w\ if need be? Like if things really suck & I need to break out the fucking DOS script?

Post by TheCatt »

Curl lives for the command line.

Post by Malcolm »

Damn, they have made that app into something of a sledgehammer, haven't they?

Post by Malcolm »

Alright, it's time to apply the sledgehammer since all other less brutal options have seemingly vanished. One thing worries me: when I try to raid the info w\ Python, I can apparently log in, but only for a single request. In other words, Python doesn't seem to remember whatever validation the other site wants. Normally, I'd just chalk this up to something fucked up w\ Python, but these fuckers have proven unusually resilient w\ their defenses.
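A login that lasts for one request only is what a plain `urlopen` call tends to produce, since nothing carries the session cookie over to the next request. A minimal sketch of the usual standard-library remedy, assuming the site's validation really is cookie-based (unconfirmed here): route every request through one opener that owns a cookie jar. The function names and any URLs or form fields passed in are placeholders.

```python
import http.cookiejar
import urllib.parse
import urllib.request


def make_session():
    """Return (opener, jar); the opener stores and resends cookies."""
    jar = http.cookiejar.CookieJar()
    opener = urllib.request.build_opener(
        urllib.request.HTTPCookieProcessor(jar)
    )
    return opener, jar


def login(opener, login_url, fields):
    """POST the form fields; any Set-Cookie in the reply lands in the jar."""
    body = urllib.parse.urlencode(fields).encode("ascii")
    with opener.open(login_url, data=body, timeout=30) as response:
        return response.read().decode("utf-8", errors="replace")


def fetch(opener, url):
    """GET a page through the same opener, so the saved cookies go along."""
    with opener.open(url, timeout=30) as response:
        return response.read().decode("utf-8", errors="replace")
```

The point of the design is that `login` and every later `fetch` share the same `opener`. Creating a fresh opener, or calling `urllib.request.urlopen` directly, starts again with no cookies.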

Post by TheCatt »

Something like... a cookie? trap the cookies with curl.
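Trapping the cookies with curl generally means one call that logs in and writes a cookie file, then later calls that read it back. A sketch of the two command lines, built from Python so they can be looped over. The URLs, form data, and file names passed in are placeholders; `-c`, `-b`, `-d`, `-s`, and `-o` are standard curl options.

```python
import subprocess


def curl_login_command(login_url, form_data, cookie_file):
    """curl call that POSTs the login form and writes cookies to cookie_file."""
    # -c: save received cookies to this file, -d: send the string as a POST body
    return ["curl", "-s", "-c", cookie_file, "-d", form_data, login_url]


def curl_fetch_command(url, cookie_file, out_file):
    """curl call that sends the saved cookies and stores the page source."""
    # -b: read cookies from this file, -o: write the response body to out_file
    return ["curl", "-s", "-b", cookie_file, "-o", out_file, url]


def run(command):
    """Run one curl command; raises CalledProcessError if curl fails."""
    subprocess.run(command, check=True)
```

The login command runs once, and every admin-panel page is then fetched with the same cookie file. Whether the site's session actually lives in a cookie is still a guess at this point in the thread.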