Python

Posted: Wed Feb 22, 2017 8:39 pm
by TheCatt
Talking to Citadel tomorrow. They want python.

Posted: Wed Feb 22, 2017 9:02 pm
by Malcolm
It's easy to pick up.

Posted: Thu Feb 23, 2017 11:51 am
by TheCatt
Yeah, I've supported it before. Honestly, all I really want from Citadel is an offer so my current company knows they need to keep up :)

Posted: Thu May 04, 2017 2:04 pm
by Troy
Back off the road. Looking for something to do at my cube other than reddit.

Image

This sucker called out to me. First couple chapters have been straightforward enough, still haven't moved past what I'd already gotten from Python Meetups and Conferences.

I just need to fully dust off my Python skills, figure out the new libraries, and then find a worthy dataset.

Posted: Fri May 05, 2017 2:06 pm
by Malcolm
Machine learning? Yawn. You will come to hate the phrases "training data" and "vector machine."

Posted: Sat May 06, 2017 9:33 pm
by TheCatt
Malcolm wrote: Machine learning? Yawn. You will come to hate the phrases "training data" and "vector machine."
And the $200k offers.

Posted: Sun May 07, 2017 8:48 pm
by Malcolm
TheCatt wrote:
Malcolm wrote: Machine learning? Yawn. You will come to hate the phrases "training data" and "vector machine."
And the $200k offers.
In Frisco dollars?

Posted: Thu Jun 01, 2017 6:10 pm
by Troy
I wrote my first real Python program for professional use today. Just a simple script validating the number of exhibits and page numbers (and therefore individual TIFFs) in a trial database.

The program our techs were using took hours to run.

Mine takes a few minutes! (a few hours to write)

6k documents, 480k TIFFs.

Posted: Sun Jun 04, 2017 11:11 pm
by Malcolm
I should've subcontracted out the shitty custom file compare job I had to you.

Posted: Tue Mar 27, 2018 8:09 pm
by TheCatt
I wrote my first python script. It logs into a client-certificate secured website, and then gets a list of available files. I then hard-coded one file to download.

Need to re-write it so that it will parse the list of files, and then download all of them.

But not bad for 60 minutes of work... which was really a lot of googling.
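
For reference, the client-certificate part can be sketched roughly like this. This assumes the requests library, which the post does not actually name at this point, and the helper names make_session and download_file, the paths, and the URL are all made up for illustration.

```python
import os
from urllib.parse import urlsplit


def make_session(cert_path, key_path):
    """Return a requests.Session that presents a client certificate."""
    # Assumption: requests is the HTTP library. Imported lazily so that
    # download_file below has no hard dependency on it.
    import requests

    session = requests.Session()
    session.cert = (cert_path, key_path)  # PEM certificate and private key files
    return session


def download_file(session, url, dest_dir):
    """Fetch one URL with the given session and save it under dest_dir."""
    response = session.get(url, timeout=30)
    response.raise_for_status()  # fail loudly on 4xx/5xx responses
    filename = os.path.basename(urlsplit(url).path)  # drops any query string
    dest_path = os.path.join(dest_dir, filename)
    with open(dest_path, "wb") as fh:
        fh.write(response.content)
    return dest_path
```

Usage would be something like download_file(make_session("client.pem", "client.key"), "https://example.invalid/files/report.xml", "downloads"), with the real site's URL and certificate files swapped in.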

Posted: Tue Mar 27, 2018 8:29 pm
by TheCatt
Added that in less than 20 minutes, saved all the files out as well. woot woot

Posted: Sat Mar 31, 2018 12:39 am
by TheCatt
I did a little more python, got more familiar with scripts, and some of the (many, many) modules out there.

I was asked by an energy trading firm to write a program in python as part of a job test/recruitment thing. I sent them a version today... and have to fly up to Cambridge, MA to see about an interview later this month.

Posted: Sat Mar 31, 2018 2:12 am
by Troy
TheCatt wrote: I did a little more python, got more familiar with scripts, and some of the (many, many) modules out there.

I was asked by an energy trading firm to write a program in python as part of a job test/recruitment thing. I sent them a version today... and have to fly up to Cambridge, MA to see about an interview later this month.
Killer! That sounds awesome. Are you adding it to your contracting side-hustle or thinking about switching?

Can you tell more about the program they asked you to write as a test?

Posted: Sat Mar 31, 2018 9:20 am
by TheCatt
Troy wrote: Are you adding it to your contracting side-hustle or thinking about switching?
They are part of my existing side-hustle. They wanted to hire me 3 years ago, but I refused to move to Boston. They hired me to write a trading system for them, which I did. They also hired someone else to be their day-to-day guy. That guy quit, so they're looking. Since they've worked with me for 3 years now, they are open to me working remote, and visiting 1 week/month. So this would be switching.
Troy wrote: Can you tell more about the program they asked you to write as a test?
They are trying to collect data that's in an XML format from a certain publisher. The files are published as links on an HTML page. There are two types of files, one is published every 15 minutes, the other published every 60 minutes.

Program 1 (one for 15 minute data, one for 60 minute data)
1 - Turns out the requests library is very handy for downloading HTML.
2 - To identify all links, I just used xpath against the HTML to pull the href out of every a tag.
3 - I then iterate over the collection, and verify that each link is a file I want (some links are irrelevant)
4 - If it's relevant, I extract the filename, and use scandir to look for that in history. If it doesn't exist, I download it.
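
A rough sketch of those four steps, under some loud assumptions: lxml is used for the XPath step (the post only says "xpath"), the relevance check is a made-up ".xml extension" rule, and the function names are hypothetical. The actual download of each missing file is left out.

```python
import os
from urllib.parse import urljoin, urlsplit


def extract_links(page_url):
    """Steps 1-2: download the HTML page and return every a/href as an absolute URL."""
    import requests        # third-party, named in the post
    from lxml import html  # third-party; assumption, any XPath-capable parser would do

    response = requests.get(page_url, timeout=30)
    response.raise_for_status()
    tree = html.fromstring(response.content)
    return [urljoin(page_url, href) for href in tree.xpath("//a/@href")]


def is_wanted(url):
    """Step 3: hypothetical relevance rule, keep only links ending in .xml."""
    return urlsplit(url).path.lower().endswith(".xml")


def missing_files(urls, history_dir):
    """Step 4: return (url, filename) pairs not yet present in history_dir."""
    with os.scandir(history_dir) as entries:
        have = {entry.name for entry in entries if entry.is_file()}
    todo = []
    for url in urls:
        if not is_wanted(url):
            continue
        name = os.path.basename(urlsplit(url).path)
        if name not in have:
            todo.append((url, name))
    return todo
```

Reading the history directory once into a set keeps the check cheap even when the page lists hundreds of links.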

Program 2 (one for 15 minute, one for 60 minute)
1 - Takes a given XML file.
2 - Looks for the relevant elements in each XML node that I care about (about 1000 for 1 file, 11000 for the 2nd file)
3 - I load these into a tuple.
4 - I load the tuples into an array.
5 - Every 1,000 rows, I send the data to the DB using fast_executemany (orders of magnitude faster than single-line inserts)
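
Roughly, the five steps could look like the sketch below. It assumes pyodbc (fast_executemany is a pyodbc cursor flag, but the post does not name the driver), and the element names, field names, connection string, and INSERT statement are all placeholders supplied by the caller rather than anything from the real feed.

```python
import xml.etree.ElementTree as ET


def iter_rows(xml_path, node_tag, fields):
    """Steps 1-3: yield one tuple per <node_tag> element, one value per child in fields."""
    for _, elem in ET.iterparse(xml_path, events=("end",)):
        if elem.tag == node_tag:
            yield tuple(elem.findtext(name) for name in fields)
            elem.clear()  # release the parsed children to keep memory flat


def batches(rows, size=1000):
    """Step 4: group rows into lists of at most `size` items."""
    batch = []
    for row in rows:
        batch.append(row)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:
        yield batch  # final partial batch


def load(xml_path, conn_str, insert_sql, node_tag, fields):
    """Step 5: send each batch to the database with fast_executemany turned on."""
    import pyodbc  # third-party; assumption, see the note above

    conn = pyodbc.connect(conn_str)
    try:
        cursor = conn.cursor()
        cursor.fast_executemany = True  # send parameter arrays instead of row-by-row calls
        for batch in batches(iter_rows(xml_path, node_tag, fields)):
            cursor.executemany(insert_sql, batch)
        conn.commit()
    finally:
        conn.close()
```

If the real files use XML namespaces, the tag names passed in would need the namespace prefix in ElementTree's {uri}tag form.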

There's a lot to do in terms of error-handling, email notifications, robustness, etc. but this was basically a test to see if I could do one of their main needs.

Posted: Mon Apr 16, 2018 5:15 pm
by TheCatt
I'm going up there tomorrow. They also want me to present about AWS, and what will make their life easier in AWS. They tend to use a lot of python, Excel, client-server style apps. So that could be a bit trickier.

Posted: Mon Apr 16, 2018 6:31 pm
by thibodeaux
Glue or Batch? Spark on EMR? I dunno.

Posted: Mon Apr 16, 2018 6:41 pm
by TheCatt
thibodeaux wrote: Glue or Batch? Spark on EMR? I dunno.
Glue is terrible. Do you actually use it? I had a half day with our AWS Account reps, and I told them if they mentioned Glue they had to buy lunch. Of course, we dialed in some AWS experts later, and the first one mentioned Glue, so we get 'free lunch'

They only use S3 (and only a little), EC2, and RDS.

So the ones I think would be relevant to them, or could be:
Lambda – Run code for short periods of time, serverlessly.
  - Scheduled operations
  - Triggered operations (new file in S3)
S3 – Storage (S3 IA, Glacier)
RDS – Aurora or open-source engines ($) – Serverless (auto-scaling RDBMS)
Boto3 – AWS API via python
Neptune – Graph DB representation of power networks
Redshift – OLAP DB for analytical workloads.
EMR – Spark?
Step Functions – (State machines)
SageMaker – Guided/Automated Machine Learning.
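
To make the "triggered operations (new file in S3)" idea concrete, here is a minimal sketch of a Lambda handler written with Boto3. The function names and what the handler does with each file (it only looks up the size) are illustrative assumptions, not anything from the firm's setup.

```python
from urllib.parse import unquote_plus


def objects_from_event(event):
    """Return (bucket, key) pairs from an S3 event notification payload."""
    pairs = []
    for record in event.get("Records", []):
        s3_info = record["s3"]
        # Object keys arrive URL-encoded in the event, so decode them first.
        key = unquote_plus(s3_info["object"]["key"])
        pairs.append((s3_info["bucket"]["name"], key))
    return pairs


def handler(event, context):
    """Lambda entry point: look up the size of each newly created object."""
    import boto3  # bundled with the Lambda Python runtime

    s3 = boto3.client("s3")
    sizes = {}
    for bucket, key in objects_from_event(event):
        head = s3.head_object(Bucket=bucket, Key=key)
        sizes[key] = head["ContentLength"]
    return sizes
```

In practice the head_object call would be replaced by whatever processing the new file needs, such as parsing it and loading it into RDS.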

Posted: Mon Apr 16, 2018 8:13 pm
by thibodeaux
I have not used Glue. I was maybe also going to recommend Sagemaker since it'll do some management of training jobs and also stand up a REST endpoint but it's pretty specialized unless you want to roll your own docker image.

Posted: Mon Apr 16, 2018 8:32 pm
by TheCatt
thibodeaux wrote: I have not used Glue. I was maybe also going to recommend Sagemaker since it'll do some management of training jobs and also stand up a REST endpoint but it's pretty specialized unless you want to roll your own docker image.
Have you used Sagemaker? The docs promise to bring ML to people who don't know ML. Truth? I haven't used it at all yet.

Posted: Mon Apr 16, 2018 8:37 pm
by thibodeaux
I don’t think it’s gonna do that. I’ve done some tutorials and read some docs. It’s basically a docker container that will run a Jupyter notebook and also launch training jobs and host a web server.

It might be more accurate to say it manages some of the plumbing of ML, so that if you do understand ML you can actually do ML instead of the plumbing which is normally about 90% of what you have to do to get ML in production. And not even sure how good a job it does of that.