moderately simple macroing
Alright, I need to duplicate what I think is the real DB of info for our hosted sites. Unfortunately, I can't just write a convenient little script to grab that info due to various things. I just need a small app that can grab a list of webpages (following a regular pattern) & save the HTML source to a .txt file (for later parsing). Anyone got any favourite macro programs they use?
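For what it's worth, the grab-and-save part on its own is only a few lines of Python. This is a minimal sketch, assuming the pages differ by a single number in the URL. The example.com pattern and the names `build_urls`, `fetch_source`, and `save_pages` are made up for illustration, and it does nothing about logins.

```python
from pathlib import Path
from urllib.request import urlopen


def build_urls(pattern, ids):
    """Fill each id into a URL pattern that contains one {} placeholder."""
    return [pattern.format(i) for i in ids]


def fetch_source(url):
    """Return the raw bytes of the page, i.e. the HTML source as served."""
    with urlopen(url, timeout=30) as response:
        return response.read()


def save_pages(urls, out_dir, fetch=fetch_source):
    """Save each page's source to out_dir/page_NNNN.txt and return the paths."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    saved = []
    for n, url in enumerate(urls, start=1):
        target = out_dir / f"page_{n:04d}.txt"
        target.write_bytes(fetch(url))  # bytes in, bytes out: no re-encoding
        saved.append(target)
    return saved


# Hypothetical usage (the pattern is an assumption, substitute the real one):
# urls = build_urls("http://www.example.com/item.asp?id={}", range(1, 4))
# save_pages(urls, "pages")
```

The `fetch` argument is only there so the download step can be swapped for something else later, such as a version that knows how to log in.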
Diogenes of Sinope: "It is not that I am mad, it is only that my head is different from yours."
Arnold Judas Rimmer, BSC, SSC: "Better dead than smeg."
There's probably some way to do this with Visual Basic, but I don't know how to get it to hit web addresses.
If EZMacros doesn't work, let me know and I'll hack at it one night with VB.
EDIT: Okay, the Open method hits web addresses just fine, so yeah, VB could take a list of web addresses and cycle through them opening them and saving to the local drive. Not sure just how to convert to source text on the way but I'm sure it can probably be done.
Edited By TPRJones on 1193075170
"ATTENTION: Customers browsing porn must hold magazines with both hands at all times!"
This doesn't work. It will pull up and save files just fine, but it doesn't get at the source code. I can't seem to figure out how to make that part work. Yet. But to yank the visible file contents, this will work fine.
The source code eludes me so far. I will find a way, though.
Make a sheet in Excel where column A holds the complete web addresses (without http:// at the front; you can include it if you want, just take it out of the code below) and column B holds where to save each file (full drive and path; as with the web addresses, you can put the path into the script below instead of adding it to every line of the sheet). Make a copy of your sheet, because the macro below will churn through it and delete the contents. Execute the following code on the copy:
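Code:
Sub SavePages()
    ' Column A: web address without http://, column B: full save path
    Dim listSheet As Worksheet, page As Workbook
    Set listSheet = ActiveSheet
    Do While listSheet.Range("A1").Value <> ""
        Set page = Workbooks.Open(Filename:="http://" & listSheet.Range("A1").Value)
        ' Read the save path from the list sheet, not from the page that was just opened
        page.SaveAs Filename:=listSheet.Range("B1").Value, FileFormat:=xlUnicodeText, CreateBackup:=False
        page.Close SaveChanges:=False
        ' Delete the finished row so the next address moves up into A1
        listSheet.Rows(1).Delete Shift:=xlUp
    Loop
End Sub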
"ATTENTION: Customers browsing porn must hold magazines with both hands at all times!"
Shush, you. This is what I do for fun.
I don't have FrontPage, and that's probably where the library with the command I'd need to make this work is hiding. If there's a way to view HTML source code in Word or Excel or Publisher, I sure can't find it.
I like VB because it's on almost everyone's computer these days, making it the most widely installed compiler I'm aware of. But it's pretty heavily tied into the Office suite, and that can be a bit annoying when you are trying to do something they weren't really meant to do.
"ATTENTION: Customers browsing porn must hold magazines with both hands at all times!"
Well, hell, the other technical issue you might take orgasmic pleasure in is this ...
I log into the admin panel for a site going thru some weird login page using something that's obviously not a standard HTML form, but when you look at the source, you can clearly make out where the username & password go. All that said, I can't figure out how to make it submit the form w/ my desired credentials, so I can then go in & browse the admin panel (where all the info I want is).
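If the login does boil down to an ordinary POST of name/value pairs (an assumption; nothing here confirms it), one way to approach it is to scrape every named input out of the page source, hidden ones included, and send them back with the credentials filled in. A rough sketch follows. The field names `user` and `pass`, the URLs, and both helper names are hypothetical, and the real page may build its request in script instead.

```python
from html.parser import HTMLParser
from urllib.parse import urlencode
from urllib.request import Request


class InputFieldParser(HTMLParser):
    """Collect name/value pairs from every <input> tag in a page."""

    def __init__(self):
        super().__init__()
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "input" and attrs.get("name"):
            # Hidden inputs keep their preset value; empty ones default to "".
            self.fields[attrs["name"]] = attrs.get("value") or ""


def build_login_request(submit_url, html, credentials):
    """Build a POST carrying the page's own fields plus the credentials."""
    parser = InputFieldParser()
    parser.feed(html)
    data = dict(parser.fields)
    data.update(credentials)  # overwrite the blank username/password fields
    body = urlencode(data).encode("ascii")
    return Request(submit_url, data=body)  # data present, so the method is POST


# Hypothetical usage:
# req = build_login_request("https://admin.example.com/login.asp",
#                           page_source, {"user": "me", "pass": "s3cret"})
```

Sending the hidden fields back unchanged matters on sites that check them; whether this one does is a guess.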
Normally I just pick thru webpages by grabbing them in .txt format w/ Python & parsing them later at my leisure. However, I can't figure out a way to log in to the site from the Python code. Which means I can't get whatever info is used to verify that the connection is secure, which itself means I never get to see the HTML for the admin panel.
So, I did some digging & found ways to feed cookies into Python code (I guessed they were using a cookie -- stupid me). It appears to be some kind of mutant ASP session id that I cannot, in any way, shape, or form, even find in the browser's cache in any usable format.
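Session cookies usually carry no expiry date, so browsers tend to keep them in memory rather than on disk, which may be why nothing usable turns up in the cache. If the id can be read some other way (a browser's cookie viewer, say), it can be sent by hand. A small sketch; the cookie name and value below are made up, and `request_with_cookies` is a hypothetical helper:

```python
from urllib.request import Request


def request_with_cookies(url, cookies):
    """Build a request that sends the given name/value pairs as cookies."""
    header = "; ".join(f"{name}={value}" for name, value in cookies.items())
    return Request(url, headers={"Cookie": header})


# Hypothetical usage (name and value are placeholders, not real ones):
# req = request_with_cookies("https://admin.example.com/orders.asp",
#                            {"ASPSESSIONIDABCDEFGH": "PLACEHOLDERVALUE"})
# html = urllib.request.urlopen(req).read()
```

A copied session id only works until the server expires it, so this is a stopgap at best.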
Then I thought I could hack my way in w/ Javascript & just steal things using getElementById. I do not have permission to grab elements by id. Fucking unreal. If they went to as much trouble designing their interfaces as they do their security measures (keeping me from MY OWN DATA, no less), the end result might not suck ass so hard. Fuck you, Monster Commerce.
Diogenes of Sinope: "It is not that I am mad, it is only that my head is different from yours."
Arnold Judas Rimmer, BSC, SSC: "Better dead than smeg."
Ah, well in that case even if I could fix the view-source problem, that bit of code wouldn't help you any.
That does sound ugly. If you can just get logged in, EZMacros sounds like the way to go.
Edited By TPRJones on 1193083114
"ATTENTION: Customers browsing porn must hold magazines with both hands at all times!"
Yeah, we did.
It does HTTP and FTP automation like nothing else. You may need to have a scripting language around it to variablize things, but it can do everything you've mentioned.
It's not me, it's someone else.
Alright, for Christ's sake. Is this something that I can hit the command line w/ if need be? Like if things really suck & I need to break out the fucking DOS script?
Diogenes of Sinope: "It is not that I am mad, it is only that my head is different from yours."
Arnold Judas Rimmer, BSC, SSC: "Better dead than smeg."
Alright, it's time to apply the sledgehammer since all the less brutal options have seemingly vanished. One thing worries me: in attempting to raid the info w/ Python, there's apparently a way to log in, but for one transaction request only. In other words, Python doesn't seem to remember whatever validation the other site wants. Normally, I'd just chalk this up to something fucked up w/ Python, but these fuckers have proven unusually resilient w/ their defenses.
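One possible cause of the forgetting: a plain `urlopen` call does not store cookies, so every request starts from scratch. Routing all requests through one opener with a cookie jar attached is the usual workaround. A sketch under that assumption, using the Python 3 module names (`urllib.request`, `http.cookiejar`; the Python 2 equivalents were `urllib2` and `cookielib`). The function names, URLs, and credential fields are hypothetical.

```python
from http.cookiejar import CookieJar
from urllib.parse import urlencode
from urllib.request import HTTPCookieProcessor, build_opener


def make_session():
    """Return (opener, jar); the opener stores and resends cookies itself."""
    jar = CookieJar()
    opener = build_opener(HTTPCookieProcessor(jar))
    return opener, jar


def login_then_fetch(opener, login_url, credentials, page_url):
    """POST the credentials, then fetch another page with the same opener."""
    body = urlencode(credentials).encode("ascii")
    with opener.open(login_url, data=body, timeout=30) as response:
        response.read()  # any Set-Cookie from the login lands in the jar
    with opener.open(page_url, timeout=30) as response:
        return response.read()  # this request carried the cookies back


# Hypothetical usage:
# opener, jar = make_session()
# html = login_then_fetch(opener, "https://admin.example.com/login.asp",
#                         {"user": "me", "pass": "s3cret"},
#                         "https://admin.example.com/orders.asp")
```

If the site validates something other than cookies (a token in the URL, for instance), this will not be enough on its own.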
Diogenes of Sinope: "It is not that I am mad, it is only that my head is different from yours."
Arnold Judas Rimmer, BSC, SSC: "Better dead than smeg."