Python

TheCatt
Site Admin
Posts: 53716
Joined: Thu May 20, 2004 11:15 pm
Location: Cary, NC

Python

Post by TheCatt »

Talking to Citadel tomorrow. They want python.
It's not me, it's someone else.
Malcolm
Posts: 32040
Joined: Fri May 21, 2004 1:04 pm
Location: Minneapolis

Python

Post by Malcolm »

It's easy to pick up.
Diogenes of Sinope: "It is not that I am mad, it is only that my head is different from yours."
Arnold Judas Rimmer, BSC, SSC: "Better dead than smeg."
TheCatt
Site Admin
Posts: 53716
Joined: Thu May 20, 2004 11:15 pm
Location: Cary, NC

Python

Post by TheCatt »

Yeah, I've supported it before. Honestly, all I really want from Citadel is an offer so my current company knows they need to keep up :)
It's not me, it's someone else.
Troy
Posts: 7156
Joined: Mon Jun 07, 2004 8:00 am

Python

Post by Troy »

Back off the road. Looking for something to do at my cube other than reddit.

Image

This sucker called out to me. First couple chapters have been straightforward enough, still haven't moved past what I'd already gotten from Python Meetups and Conferences.

I just need to fully dust off my Python skills, figure out the new libraries, and then find a worthy dataset.
Malcolm
Posts: 32040
Joined: Fri May 21, 2004 1:04 pm
Location: Minneapolis

Python

Post by Malcolm »

Machine learning? Yawn. You will come to hate the phrases "training data" and "vector machine."
Diogenes of Sinope: "It is not that I am mad, it is only that my head is different from yours."
Arnold Judas Rimmer, BSC, SSC: "Better dead than smeg."
TheCatt
Site Admin
Posts: 53716
Joined: Thu May 20, 2004 11:15 pm
Location: Cary, NC

Python

Post by TheCatt »

Malcolm wrote: Machine learning? Yawn. You will come to hate the phrases "training data" and "vector machine."
And the $200k offers.
It's not me, it's someone else.
Malcolm
Posts: 32040
Joined: Fri May 21, 2004 1:04 pm
Location: Minneapolis

Python

Post by Malcolm »

TheCatt wrote:
Malcolm wrote: Machine learning? Yawn. You will come to hate the phrases "training data" and "vector machine."
And the $200k offers.
In Frisco dollars?
Diogenes of Sinope: "It is not that I am mad, it is only that my head is different from yours."
Arnold Judas Rimmer, BSC, SSC: "Better dead than smeg."
Troy
Posts: 7156
Joined: Mon Jun 07, 2004 8:00 am

Python

Post by Troy »

I wrote my first real Python program for professional use today. Just a simple script validating the # of exhibits and page numbers (and therefore individual TIFFs) in a trial database.

The program our techs were using took hours to run.

Mine takes a few minutes! (a few hours to write)

6k documents, 480k TIFFs.
Malcolm
Posts: 32040
Joined: Fri May 21, 2004 1:04 pm
Location: Minneapolis

Python

Post by Malcolm »

I should've subcontracted out the shitty custom file compare job I had to you.
Diogenes of Sinope: "It is not that I am mad, it is only that my head is different from yours."
Arnold Judas Rimmer, BSC, SSC: "Better dead than smeg."
TheCatt
Site Admin
Posts: 53716
Joined: Thu May 20, 2004 11:15 pm
Location: Cary, NC

Python

Post by TheCatt »

I wrote my first python script. It logs into a client-certificate secured website, and then gets a list of available files. I then hard-coded one file to download.

Need to re-write it so that it will parse the list of files, and then download all of them.

But not bad for 60 minutes of work... which was really a lot of googling.
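
For reference, a minimal sketch of the client-certificate part using only the standard library. The post doesn't say which library was actually used, so the function names, the PEM file paths, and the URL in the comments are all hypothetical.

```python
import ssl
import urllib.request


def build_client_cert_opener(cert_file=None, key_file=None):
    """Return a urllib opener that presents a client certificate over HTTPS.

    cert_file / key_file are hypothetical PEM paths. With cert_file=None the
    opener behaves like a plain HTTPS client, which is handy for a dry run.
    """
    context = ssl.create_default_context()
    if cert_file is not None:
        # Attach the client certificate and its private key to the TLS context.
        context.load_cert_chain(certfile=cert_file, keyfile=key_file)
    handler = urllib.request.HTTPSHandler(context=context)
    return urllib.request.build_opener(handler)


def fetch_text(opener, url, timeout=30):
    """Download a URL through the opener and return the body as text."""
    with opener.open(url, timeout=timeout) as response:
        return response.read().decode("utf-8", errors="replace")


# Hypothetical usage (not executed here, needs real files and a real site):
# opener = build_client_cert_opener("client.pem", "client.key")
# listing = fetch_text(opener, "https://example.com/files")
```

Parsing the returned listing depends entirely on the site's format, so that part is left out.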
It's not me, it's someone else.
TheCatt
Site Admin
Posts: 53716
Joined: Thu May 20, 2004 11:15 pm
Location: Cary, NC

Python

Post by TheCatt »

Added that in less than 20 minutes, saved all the files out as well. woot woot
It's not me, it's someone else.
TheCatt
Site Admin
Posts: 53716
Joined: Thu May 20, 2004 11:15 pm
Location: Cary, NC

Python

Post by TheCatt »

I did a little more python, got more familiar with scripts, and some of the (many, many) modules out there.

I was asked by an energy trading firm to write a program in python as part of a job test/recruitment thing. I sent them a version today... and have to fly up to Cambridge, MA to see about an interview later this month.
It's not me, it's someone else.
Troy
Posts: 7156
Joined: Mon Jun 07, 2004 8:00 am

Python

Post by Troy »

TheCatt wrote: I did a little more python, got more familiar with scripts, and some of the (many, many) modules out there.

I was asked by an energy trading firm to write a program in python as part of a job test/recruitment thing. I sent them a version today... and have to fly up to Cambridge, MA to see about an interview later this month.
Killer! That sounds awesome. Are you adding it to your contracting side-hustle or thinking about switching?

Can you tell more about the program they asked you to write as a test?
Last edited by Troy on Sat Mar 31, 2018 1:51 pm, edited 1 time in total.
TheCatt
Site Admin
Posts: 53716
Joined: Thu May 20, 2004 11:15 pm
Location: Cary, NC

Python

Post by TheCatt »

Troy wrote: Are you adding it to your contracting side-hustle or thinking about switching?
They are part of my existing side-hustle. They wanted to hire me 3 years ago, but I refused to move to Boston. They hired me to write a trading system for them, which I did. They also hired someone else to be their day-to-day guy. That guy quit, so they're looking. Since they've worked with me for 3 years now, they are open to me working remote, and visiting 1 week/month. So this would be switching.
Troy wrote: Can you tell more about the program they asked you to write as a test?
They are trying to collect data that's in an XML format from a certain publisher. The files are published as links on an HTML page. There are two types of files, one is published every 15 minutes, the other published every 60 minutes.

Program 1 (one for 15 minute data, one for 60 minute data)
1 - Turns out the requests library is very handy for downloading HTML.
2 - To identify all links, I just used xpath against the HTML to identify all a/href tags.
3 - I then iterate over the collection, and verify that each link is a file I want (some links are irrelevant)
4 - If it's relevant, I extract the filename, and use scandir to look for that in history. If it doesn't exist, I download it.
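
A rough sketch of steps 1-4, for illustration only. To keep it runnable without third-party packages it swaps in the standard library's html.parser and urllib for the requests + XPath combination described above. The ".xml" relevance rule, the function names, and the history directory layout are assumptions, not details from the actual program.

```python
import os
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse
from urllib.request import urlretrieve


class LinkCollector(HTMLParser):
    """Collect the href of every <a> tag (stand-in for an //a/@href XPath)."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def extract_links(html, base_url):
    """Step 2: return every link on the page as an absolute URL."""
    parser = LinkCollector()
    parser.feed(html)
    return [urljoin(base_url, href) for href in parser.hrefs]


def is_wanted(url, suffix=".xml"):
    """Step 3: hypothetical relevance rule, keep only links ending in .xml."""
    return urlparse(url).path.lower().endswith(suffix)


def missing_files(urls, history_dir):
    """Step 4: return (url, filename) pairs not yet present in history_dir."""
    with os.scandir(history_dir) as entries:
        have = {entry.name for entry in entries if entry.is_file()}
    result = []
    for url in urls:
        name = os.path.basename(urlparse(url).path)
        if is_wanted(url) and name not in have:
            result.append((url, name))
    return result


def download_missing(page_html, base_url, history_dir):
    """Download every wanted file that isn't already in history_dir."""
    links = extract_links(page_html, base_url)
    for url, name in missing_files(links, history_dir):
        urlretrieve(url, os.path.join(history_dir, name))
```

Reading the directory once into a set keeps the "do I already have it?" check cheap even when the history folder gets large.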

Program 2 (one for 15 minute, one for 60 minute)
1 - Takes a given XML file.
2 - Looks for the relevant elements in each XML node that I care about (about 1000 for 1 file, 11000 for the 2nd file)
3 - I load these into a tuple.
4 - I load the tuples into an array.
5 - Every 1,000 rows, I send the data to the DB using fast_executemany (orders of magnitude faster than single-line inserts)
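
A sketch of the parse-and-batch pattern from steps 1-5, under stated assumptions. The XML element names, the table, and the column names are invented for the example. sqlite3 stands in for the real database so the snippet runs anywhere; with pyodbc the equivalent would be setting cursor.fast_executemany = True before calling executemany.

```python
import sqlite3
import xml.etree.ElementTree as ET


def rows_from_xml(xml_text):
    """Yield one (node, timestamp, value) tuple per <reading> element.

    The element and attribute names are made up for this sketch.
    """
    root = ET.fromstring(xml_text)
    for reading in root.iter("reading"):
        yield (
            reading.get("node"),
            reading.findtext("timestamp"),
            float(reading.findtext("value")),
        )


def load_in_batches(conn, rows, batch_size=1000):
    """Insert rows with executemany, flushing every batch_size rows."""
    sql = "INSERT INTO readings (node, ts, value) VALUES (?, ?, ?)"
    cursor = conn.cursor()
    # With pyodbc, this is where cursor.fast_executemany = True would go.
    batch = []
    total = 0
    for row in rows:
        batch.append(row)
        if len(batch) >= batch_size:
            cursor.executemany(sql, batch)
            total += len(batch)
            batch.clear()
    if batch:
        # Flush the final partial batch.
        cursor.executemany(sql, batch)
        total += len(batch)
    conn.commit()
    return total
```

Batching keeps memory flat for the larger file while still avoiding one round trip per row.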

There's a lot to do in terms of error-handling, email notifications, robustness, etc. but this was basically a test to see if I could do one of their main needs.
It's not me, it's someone else.
TheCatt
Site Admin
Posts: 53716
Joined: Thu May 20, 2004 11:15 pm
Location: Cary, NC

Python

Post by TheCatt »

I'm going up there tomorrow. They also want me to present about AWS, and what will make their life easier in AWS. They tend to use a lot of python, Excel, client-server style apps. So that could be a bit trickier.
It's not me, it's someone else.
thibodeaux
Posts: 8055
Joined: Thu May 20, 2004 7:32 pm

Python

Post by thibodeaux »

Glue or Batch? Spark on EMR? I dunno.
TheCatt
Site Admin
Posts: 53716
Joined: Thu May 20, 2004 11:15 pm
Location: Cary, NC

Python

Post by TheCatt »

thibodeaux wrote: Glue or Batch? Spark on EMR? I dunno.
Glue is terrible. Do you actually use it? I had a half day with our AWS Account reps, and I told them if they mentioned Glue they had to buy lunch. Of course, we dialed in some AWS experts later, and the first one mentioned Glue, so we get 'free lunch'

They only use S3 (and only a little), EC2, and RDS.

So the ones I think would be relevant to them, or could be:
Lambda – Run code for short periods of time, serverlessly.
    Scheduled operations
    Triggered operations (new file in S3)
S3 – Storage (S3 Standard-IA, Glacier)
RDS – Aurora or open-source engines ($) – Serverless (auto-scaling RDBMS)
Boto3 – AWS API via python
Neptune – Graph DB representation of power networks
Redshift – OLAP DB for analytical workloads.
EMR – Spark?
Step Functions – (State machines)
SageMaker – Guided/Automated Machine Learning.
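
To make the "triggered operations (new file in S3)" idea concrete, here is a minimal sketch of a Lambda handler that reads the bucket and key out of an S3 event notification. The function names and the print placeholder are assumptions; a real handler would hand the object off to whatever does the parsing and loading.

```python
from urllib.parse import unquote_plus


def keys_from_s3_event(event):
    """Return (bucket, key) pairs from an S3 event notification payload."""
    pairs = []
    for record in event.get("Records", []):
        s3 = record["s3"]
        bucket = s3["bucket"]["name"]
        # Object keys arrive URL-encoded (spaces become '+').
        key = unquote_plus(s3["object"]["key"])
        pairs.append((bucket, key))
    return pairs


def handler(event, context):
    """Lambda entry point; the processing step is only a placeholder."""
    pairs = keys_from_s3_event(event)
    for bucket, key in pairs:
        print(f"new object: s3://{bucket}/{key}")
    return {"processed": len(pairs)}
```

Keeping the event parsing in its own function means it can be exercised locally with a hand-built dict, no AWS account needed.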
It's not me, it's someone else.
thibodeaux
Posts: 8055
Joined: Thu May 20, 2004 7:32 pm

Python

Post by thibodeaux »

I have not used Glue. I was maybe also going to recommend Sagemaker since it'll do some management of training jobs and also stand up a REST endpoint but it's pretty specialized unless you want to roll your own docker image.
TheCatt
Site Admin
Posts: 53716
Joined: Thu May 20, 2004 11:15 pm
Location: Cary, NC

Python

Post by TheCatt »

thibodeaux wrote: I have not used Glue. I was maybe also going to recommend Sagemaker since it'll do some management of training jobs and also stand up a REST endpoint but it's pretty specialized unless you want to roll your own docker image.
Have you used Sagemaker? The docs promise to bring ML to people who don't know ML. Truth? I haven't used it at all yet.
It's not me, it's someone else.
thibodeaux
Posts: 8055
Joined: Thu May 20, 2004 7:32 pm

Python

Post by thibodeaux »

I don’t think it’s gonna do that. I’ve done some tutorials and read some docs. It’s basically a docker container that will run a Jupyter notebook and also launch training jobs and host a web server.

It might be more accurate to say it manages some of the plumbing of ML, so that if you do understand ML you can actually do ML instead of the plumbing which is normally about 90% of what you have to do to get ML in production. And not even sure how good a job it does of that.