Saturday, May 23, 2015

Multi-threading and Web Requests


I'm writing a tool that needs to make over 200 API calls using the requests library in Python.  The function does little more than ask for information and then append that information to a list.  The current runtime looks like this:
[Finished in 209.3s]
I think I can do better.  The main delay comes from each request having to wait until the previous request has finished (that is, the requests run sequentially).  This happens because a single Python script runs as a single thread by default, processing things one after another.

Running the requests concurrently would resolve this problem.  A quick search for Python multithreading points me to the Pool class in the multiprocessing library.  Using a pool of "workers" (separate processes, in this library's case) I can run the web requests concurrently.

My first attempt at implementing a pool results in the following error:
Can't pickle
Stack Overflow had an answer pointing out that in order to pass the work around to multiple workers, the job for each worker needs to be converted into a standard format (serialized, or 'pickled').  The particular serializer used by this library doesn't understand how to serialize an instance method, so we'll help it out:
Placing the above script in my class adds support for pickling instance methods.

The next hurdle was the need to pass multiple values into a function called by a Pool instance.  While there seem to be many suggested ways of doing this, I found building a helper function to be easiest.
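
A minimal sketch of such a helper, assuming a worker function that takes two arguments. The names fetch_item and fetch_item_star are hypothetical, and the worker body is a stand-in for the real web request:

```python
def fetch_item(base_url, item_id):
    """Stand-in for the real worker; the actual tool would make a web request here."""
    return "{}/{}".format(base_url, item_id)


def fetch_item_star(args):
    """Accept one tuple of arguments and forward its values to fetch_item."""
    return fetch_item(*args)


# One tuple in, one call of the "real" function out.
print(fetch_item_star(("https://api.example.com/items", 7)))
```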


This helper function takes a tuple of arguments handed to it, and then calls the "real" function, passing the tuple values in.  Putting everything together looks something like this:
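
As a rough sketch rather than the exact script, the pieces might fit together as follows. All function names are hypothetical, the worker is a stand-in that does no network I/O, and the worker count is an arbitrary default:

```python
from multiprocessing import Pool


def fetch_item(base_url, item_id):
    """Stand-in for the real request; the actual tool would call the API here."""
    return "{}/{}".format(base_url, item_id)


def fetch_item_star(args):
    """Helper handed to pool.map: unpack one tuple and call the real function."""
    return fetch_item(*args)


def build_payload(base_url, item_ids):
    """Build the payload: one tuple of arguments per call of fetch_item."""
    return [(base_url, item_id) for item_id in item_ids]


def run_all(payload, workers=4):
    """Map the helper over the payload using a pool of worker processes."""
    with Pool(processes=workers) as pool:
        return pool.map(fetch_item_star, payload)
```

On platforms that start workers by spawning a fresh interpreter, run_all should be called from under an `if __name__ == "__main__":` guard so the workers can import the module safely.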


In the end, pool.map is handed the helper function to call, and a payload.  The payload consists of a list of tuples.  Each tuple contains the arguments needed for one call of the function doing the work.  The end result:
[Finished in 16.3s]
Running with 32 processes, the task took only 16.3 seconds.  I'd say that's a solid improvement.

Saturday, May 16, 2015

Student Debt is America's Giving Tree



I'm not an accountant.  I'm not a bureaucrat.  I am a 31-year-old with a mountain of student loan debt.  Understanding student debt requires bureaucratic and accounting skills (neither of which was part of my college education).  The real problem of student debt is the problem of 'fake money'.

To understand fake money, let's say two people decide to purchase an apple tree sapling.  Let's call them Bob and Sue.  Bob and Sue each pay $1 for 50% ownership of the sapling.  Bob and Sue are investors, so they're really in this venture for apples.  They agree that every year, they will each take 10 apples and leave the remaining apples on the tree, allowing the tree to reproduce, improve society, and do the other things that trees with apples do.


The rub of this problem occurs in year 1, when the sapling has yet to yield any fruit.  Bob and Sue aren't sure what to make of this, as now neither of them has collected their annual apples.  They agree that they'll just each take 21 apples next year (with an additional apple for interest).  Unfortunately, 2 years in, the apple sapling is still not a fruit-bearing tree, and this problem persists until the apples plus interest owed to Bob and Sue is greater than the tree could ever yield in 1 year.  Without apples to spread seed, the tree will never reproduce, local wildlife will not be able to glean fruit from the tree, and the small ecosystem to which the tree belongs is all the poorer.


Bob and Sue generated this problem by relying on 'Fake Apples'.  Just because they agree to a certain yield per year does not mean that yield will occur.  Counting missed yields as a debt (and worse, a debt with interest) only compounds the issue.  The tree is deprived of its basic fruit-bearing freedom, and Bob and Sue come across as greedy, uncaring apple investors.

Rather than 'Fake Apples', Bob and Sue could have agreed to a percentage of apples each year.  This agreement eliminates the problem of fake apples by ensuring that Bob and Sue's yield is always a known share of 'real' apples.  If Bob and Sue each agree to 25% of all apples yielded annually, then that would leave 50% for the tree and local ecology.  Further, since the tree is able to bear fruit, Bob and Sue might also invest in future saplings and see even more return.  The local ecosystem is improved by a greater number of available apples, and Bob and Sue are now investors deeply concerned with the health of their tree and that tree's children.


Translating the above problem to student debt is very easy.  In our current system, investors give freshly minted adults, and sometimes even 17-year-olds (children!), a choice.  That choice is 'Don't go to school', or instead borrow from the limitless well of student loan money.  This situation is cyclically worsened as universities increase the cost of attendance due to inflated demand, brought on by highly available 'cheap' money (student loans).

The reality is that student loans are anything but cheap.  While they offer nice incentives like "deferment while in school" and 'low' interest rates of 6%, these loans have hidden and obscured costs.  Inability to pay student loans can result in interest rate increases, penalty fees and destruction of creditworthiness.  Attempts to find information on loan consolidation and repayment will point the average apple tree (sorry, American consumer) towards predatory companies that charge fees in exchange for assistance applying to free government programs.  Worse yet, an apple tree could end up in a consolidation program that combines payments into one, but increases interest overall.  Even bankruptcy is not an option for student debt.

The problem of student debt repayment is less about people wanting to repay a debt, and more so about being handed a debt they are unable to repay.  Much like fake apples, student loans rely on fake money.  There is no guarantee that a given individual will earn a given amount, or that a fee repayment schedule will allow them to make payments and still prosper.  I propose a new loan structure that would make lenders take on a modicum of risk (of which they presently take none).

Student loans should follow a 'real' apples loan policy.  This policy would work both for new students and for past borrowers looking to convert their debt.  An example 'real' apples policy for a student with an estimated $80000 in student loans would look like this:
It is safe to say that most 22-year-olds leaving college will be employed for at least 25 years.
A 'real' apples approach looks at that $80000 and those 25 years, while considering that the lender would like to make their money grow by 20% in that time.  (These numbers can and should fluctuate with the student's choice of major, choice of school, and estimated future wages.)

$80000 * 1.20 = $96000
$96000 / 25 /12 /2 = $160
current minimum wage ($7.25) * 40 * 2 = $580
$160 is about 27% of a paycheck at minimum wage

Edit: I just re-read this.  WOW, my percentages are off.  $160 * 2 would actually be $320 a month, or about 55% of a single $580 paycheck at minimum wage.  That's WELL over the generally accepted 36% debt-to-income ratio lenders use for the purchase of homes, and WAY higher than anyone could afford to exist on while paying.  It seems as though more thought is required here.
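
The figures above can be re-derived with a short Python sketch.  It assumes two payments per month and a two-week paycheck of 40-hour weeks, as in the arithmetic above; the function names are illustrative only.

```python
def payment_per_period(principal, growth, years, payments_per_month=2):
    """Split the grown balance evenly across every payment in the term."""
    total_owed = principal * (1 + growth)
    return total_owed / (years * 12 * payments_per_month)


def two_week_paycheck(hourly_wage, hours_per_week=40):
    """Gross pay for two weeks of work at the given hourly wage."""
    return hourly_wage * hours_per_week * 2


payment = payment_per_period(80000, 0.20, 25)  # one of two payments each month
paycheck = two_week_paycheck(7.25)             # two 40-hour weeks at minimum wage

share_of_paycheck = payment / paycheck               # one payment vs one paycheck
monthly_vs_one_paycheck = (2 * payment) / paycheck   # a month of payments vs one paycheck

print(round(payment, 2), round(paycheck, 2))
print(round(share_of_paycheck * 100, 1), round(monthly_vs_one_paycheck * 100, 1))
```

Under these assumptions each $160 payment is about 27.6% of one $580 paycheck, while a full month of payments ($320) is about 55.2% of a single paycheck.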

Here is where 'real' apples loans save education and all of society.  If the agreed repayment is capped at 20% of pre-tax income (and really, if you're earning minimum wage you should be well below tax) then anyone who is employed should be able to afford twice-monthly payments.  Lenders gamble that they will receive their entire return on investment over the course of 25 years at a rate of 20% of taxable income per year, not to exceed the original $96000.

The lender is incentivized in this model because they can potentially see their return faster.  The borrower is incentivized because, as they earn more, debt is eliminated faster while remaining a constant, payable percentage of wages.  Society is a better place because debt is repaid faster, with the added bonus of incentivizing lenders to increase educational lending as the minimum wage increases.  I'll say that again - increasing the minimum wage would organically increase educational funding at a safe and sustainable rate.

How could someone earning minimum wage be asked to give up 20% of their paycheck?
The reality is they are already being asked to pay that much and more.  Worse yet, student loan repayment is often at a rate a majority of millennials cannot afford (many of whom are in their 30s and still struggling with the payments).

Twenty percent is a terrible return on a potentially 25 year investment.
Since repayment occurs faster as income increases, lenders would also be incentivized to encourage things like job placement, career advancement, selection of lucrative majors and generally "Caring" about the people whom they're presently financially abusing from a young age.  (Much like the apple tree.)

But if lenders get to pick and choose what to fund - THE ARTS WILL FAIL!
Schools will be incentivized to charge according to the cost of the program, and not a flat per-credit fee in all courses.  Charging fees according to major, based on the money a student can borrow for that major, is more respectful to the student than charging $40,000 over 4 years for both an electrical engineering program and a communications degree.  Schools are creative and full of very smart people; they'll figure it out.

So.. How about them apples?

Thursday, May 14, 2015

Beautiful Webscraping

I'll be working out of the Stack Exchange London office for the next 2 weeks.  Getting there will be a 6-hour flight, and it is unlikely to have an internet connection.  On long trips, lacking internet, I like to work on problems from Project Euler.  I'd rather not manually download every problem to my git repo, but with a little bit of Python I can automate it.

A problem on Project Euler is presented on a simple page, sampled below:


Looking at the page, we can see that the URL is easily manipulated.  Replacing the number at the end of the URL should get the corresponding problem.  Using the requests library and a for loop should quickly hit every problem page.
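
As a small illustration of the URL manipulation, the sketch below generates one URL per problem number.  The URL pattern is an assumption based on the public Project Euler site, and the function names are hypothetical; the fetch itself is left out so the example runs without a connection.

```python
# Assumed URL pattern; the problem number is the only part that changes.
PROBLEM_URL = "https://projecteuler.net/problem={number}"


def problem_url(number):
    """Return the page URL for a single problem number."""
    return PROBLEM_URL.format(number=number)


def problem_urls(first, last):
    """Return the URLs for problems first..last, inclusive."""
    return [problem_url(n) for n in range(first, last + 1)]


for url in problem_urls(1, 3):
    print(url)
```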
After hitting each page, I need to grab the relevant content.  After looking at the HTML in the page, it's clear that the problem is stored in a div tag with the class "problem_content".

BeautifulSoup can parse HTML returned by the requests and spit out just the problem text.  I've included my script below.  The relevant variables should be easy enough to change if anyone would like to include it in their own repo.
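
A minimal sketch of this approach (not the script itself) might look like the following.  It assumes the requests and beautifulsoup4 packages are installed; the function names and the timeout value are hypothetical choices.

```python
import requests
from bs4 import BeautifulSoup


def extract_problem_text(html):
    """Return the text of the div with class "problem_content", or None if absent."""
    soup = BeautifulSoup(html, "html.parser")
    div = soup.find("div", class_="problem_content")
    if div is None:
        return None
    return div.get_text("\n", strip=True)


def download_problem(url, timeout=10):
    """Fetch one problem page and return just the problem text."""
    response = requests.get(url, timeout=timeout)
    response.raise_for_status()
    return extract_problem_text(response.text)
```

Keeping the parsing separate from the download means the extraction can be tried on a saved page without touching the network.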

Wednesday, May 13, 2015

Configure GADS for Nested Groups

Configuring Google Apps Directory Sync (GADS) to utilize nested AD groups requires that the 'members' field of the GADS group search rule be populated with a valid AD filter.  GADS will look at any Unicode attribute in an AD object to fetch that filter.  Before we can point GADS to that attribute, we'll need to populate it.


That filter can be generated a number of ways, but here's an example in PowerShell:



The member field of the GADS group rule looking at your target OU should be populated with the name of the attribute that stores the filter information (in the script above, we used the info attribute).  Also ensure that the "Dynamic (query-based) group" checkbox is checked.
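
For illustration only, one common way to express nested membership in an AD filter is the LDAP_MATCHING_RULE_IN_CHAIN matching rule (OID 1.2.840.113556.1.4.1941).  The Python sketch below builds such a filter string from a group's distinguished name.  It is an assumption about what the stored filter could look like, not the script referenced above, and the function names and example DN are hypothetical.

```python
# OID of Active Directory's LDAP_MATCHING_RULE_IN_CHAIN matching rule.
IN_CHAIN_OID = "1.2.840.113556.1.4.1941"


def escape_filter_value(value):
    """Escape the characters that are special inside an LDAP filter value."""
    replacements = [
        ("\\", r"\5c"),  # backslash first, so later escapes are not re-escaped
        ("*", r"\2a"),
        ("(", r"\28"),
        (")", r"\29"),
        ("\x00", r"\00"),
    ]
    for char, escaped in replacements:
        value = value.replace(char, escaped)
    return value


def nested_member_filter(group_dn):
    """Build a filter matching objects that belong to group_dn directly or via nesting."""
    return "(memberOf:{}:={})".format(IN_CHAIN_OID, escape_filter_value(group_dn))


print(nested_member_filter("CN=Staff,OU=Groups,DC=example,DC=com"))
```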