Thursday, December 26, 2019

Social Security Administration Baby Name Database

Introduction

At this link, you can access the SSA's so-called Baby Name data, broken down both nationally and by state.  Each download is a zip file.  When you unzip the state file, it gives you one text file per state, named with the 2-letter state code.  These files use a comma-separated values (CSV) format, and each line includes the state (which seems redundant to me), the sex, the year, the name and the number of births.  The national data files are broken down by year.  Each year's file, again in CSV format, includes the name, the sex and the number of births.  I downloaded these files and decided to write a program that would graph a given name for a given locale over the years.  In the program, I used Python's csv module to read the file data into lists of lists.  One thing to note is that the csv module returns every field as a string, so you have to convert the numeric fields to floats yourself.
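As a small illustration of that string-to-number conversion, here is a minimal sketch that parses a few lines in the state-file layout described above.  The sample rows and their counts are made up for the example, and parse_state_rows is a hypothetical helper name, not part of the program shown later.

```python
import csv
import io

# Made-up sample in the state-file layout: state, sex, year, name, births.
SAMPLE = "OR,M,1960,Martin,25\nOR,M,1961,Martin,30\nOR,F,1961,Mary,120\n"

def parse_state_rows(text):
    """Return (state, sex, year, name, births) tuples with numeric fields converted."""
    rows = []
    for state, sex, year, name, births in csv.reader(io.StringIO(text)):
        # csv.reader yields strings only, so convert the numeric fields here.
        rows.append((state, sex, int(year), name, int(births)))
    return rows

rows = parse_state_rows(SAMPLE)
print(rows[0])
```

Reading from a real file works the same way: pass the open file object to csv.reader instead of the io.StringIO wrapper.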

The UI

The UI has been made with Tkinter.  It consists of an Entry for the name, a couple of radio selectors for sex, an OptionMenu for the state, a button to generate the graph and a bunch of labels.
I used an Entry for the name information.  For the sex, so as to reduce user entry errors, I used radio buttons.  Again, to ensure data validity in calls to subroutines, I used an OptionMenu object (usually called a dropdown).

The Action

Basically, everything starts after the user hits the Graph button.  This then calls the calc function.  The calc function figures out whether the user selected a national or state level request.  If it is national, then the code calls the national function; if it is state level, then it calls the bystate function.
Both the national and bystate functions go and read the appropriate files to build two lists which have corresponding x and y values to be graphed.  These both then call the graph function which uses PyPlot API calls to draw and display the graph in a separate window.  Typical results are below and of course, there's no reason why I would have picked the name Martin for the examples.


The Code

## Code to process downloaded national name data - Using GUI

from tkinter import *
from tkinter import ttk
import matplotlib.pyplot as plt
import csv

filelocation = ""  ## INSERT STRING OF WHERE YOUR FILES ARE

## This function reads the information from the national files.
## Data is Name, M/F, number of births
def national(the_name,sex):
    xdata=[]
    ydata=[]
    for year in range(1880,2019):
        file = open(filelocation+"yob"+str(year)+'.txt', newline='')
        r = csv.reader(file)
        for row in r:
            if row[0]==the_name and row[1]==sex:
                xdata.append(float(year))
                ydata.append(float(row[2]))
        file.close()
    graph(xdata,ydata,the_name,sex,"USA")


## This function reads the information from the State files.
## Data is State,Sex (M/F), Year,Name, Number of births
def bystate(the_name,sex,state):
    xdata=[]
    ydata=[]
    file=open(filelocation+state+'.txt',newline='')
    r = csv.reader(file)
    for row in r:
        if row[3]==the_name and sex==row[1]:
            xdata.append(float(row[2]))
            ydata.append(float(row[4]))
    file.close()
    graph(xdata,ydata,the_name,sex,state)
            
## This part creates the graph using Matplotlib (plt)
def graph(xdata,ydata,the_name,sex,state):
    fig,ax = plt.subplots()
    line1, = ax.plot(xdata,ydata,label=the_name)
    ax.legend(loc='upper left')
    ax.set_title('Births per year in '+state+': '+the_name+' ('+sex+')')
    plt.ylabel('Number of births')
    plt.xlabel('Year')
    plt.show()

def calc():
    mf = ['M','F']
    if sel_state.get()=='USA':
        national(getn.get(),mf[sel_sex.get()])
    else:
        bystate(getn.get(),mf[sel_sex.get()],sel_state.get())
## Below is the data for the state/national selection dropdown
states=['USA','AK','AL','AR','AZ','CA','CO','CT','DC','DE','FL','GA',
        'HI','IA','ID','IL','IN','KS','KY','LA','MA','MD','ME',
        'MI','MN','MO','MS','MT','NC','ND','NE','NH','NJ','NM',
        'NV','NY','OH','OK','OR','PA','RI','SC','SD','TN','TX',
        'UT','VA','VT','WA','WI','WV','WY']
##Below is the set up for the GUI window
root = Tk()
content = ttk.Frame(root)
frame = ttk.Frame(content)
sel_state = StringVar()
sel_sex = IntVar()
lblinstr = ttk.Label(content, text="Enter Name and location")
getn= ttk.Entry(content, text="Name")
male= ttk.Radiobutton(content, text='Male', variable=sel_sex, value=0)
female=ttk.Radiobutton(content, text='Female',variable=sel_sex,value=1)
ok = ttk.Button(content, text="Graph", command=calc)
rlbl = ttk.Label(content, text="Enter Name")
slbl = ttk.Label(content, text="Enter Sex (M/F)")
stlbl = ttk.Label(content, text="Select USA or state")
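## ttk.OptionMenu takes the default value first, then the list of choices;
## without states[0] here, 'USA' would be used as the default and left out of the menu
statedd = ttk.OptionMenu(content, sel_state, states[0], *states)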
##Below, we put the GUI together
content.grid(column = 0, row = 0)
frame.grid(column=0, row=0)
lblinstr.grid(column=0, row=0)
rlbl.grid(column=0, row=1)
getn.grid(column=1, row=1, sticky=N)
slbl.grid(column=0, row=2)
male.grid(column=1, row=2, sticky=N)
female.grid(column=2, row=2)
statedd.grid(column=1, row=3)
ok.grid(column=0, row=4)
## And run!!
root.mainloop()

Sunday, December 22, 2019

Creating graphs of Federal Reserve GDP data

After having read some interesting speculations about the future of work, AI, and robotics, I ran across a free resource in the form of the St. Louis Federal Reserve Bank's FRED system.  You do have to sign up with a valid email address, but there is no charge and the government is, well, not supposed to spam or sell your email address.
After I had signed up, I set up a download of some data: the total annualized quarterly GDP as well as its goods and services components.  See the picture below, which is a screenshot of data from the FRED system.
The download was a text file with a header row, delimited by tabs.  This is not ideal, but it is something that Python's standard csv library can handle.  The idea was to use matplotlib to graph the data.  I could have tried to create the graph on a Tkinter canvas using the draw functions.  But I am getting ahead of myself...
First we need to read the data and create Lists for the x and y values of the two different lines.  I will run through the code I wrote.  First we need to open the text file:
file = open(filelocation, newline='')
r = csv.reader(file,delimiter='\t')
Then I will initialize variables:
period=[]
goods=[]
services=[]
header=True
The variable header is there so that we can skip over the header row during the loop which reads the data and creates the iterables that will be used to graph the data.  The Pyplot api expects NumPy arrays for the data to be graphed but Lists and most other iterables work.
for row in r:
    if header:
        header=False
    else:
        date=row[0]
        serv=float(row[1])
        good=float(row[2])
        total=serv+good
        dateval=float(date[0:4])+float(date[5:7])/12
        period.append(dateval)
        goods.append(good/total*100)
        services.append(serv/total*100)
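
The same loop can be wrapped in a function so that it can be tried without the downloaded file.  This is only a sketch: it assumes the same column order as the loop above (date, services, goods) and dates that start with YYYY-MM, and the sample numbers and the name gdp_shares are made up for illustration.

```python
import csv
import io

def gdp_shares(text):
    """Return (period, goods_pct, services_pct) lists from tab-delimited text."""
    reader = csv.reader(io.StringIO(text), delimiter="\t")
    next(reader)  # skip the header row instead of tracking a flag
    period, goods, services = [], [], []
    for row in reader:
        date, serv, good = row[0], float(row[1]), float(row[2])
        total = serv + good
        # Same fractional-year convention as the loop above.
        period.append(float(date[0:4]) + float(date[5:7]) / 12)
        goods.append(good / total * 100)
        services.append(serv / total * 100)
    return period, goods, services

# Made-up numbers, only to show the shape of the input.
sample = "DATE\tSERVICES\tGOODS\n2000-01-01\t60.0\t40.0\n2000-04-01\t75.0\t25.0\n"
period, goods, services = gdp_shares(sample)
print(goods, services)
```

Calling next() on the reader once is a common alternative to the header flag; the rest of the arithmetic is unchanged.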

Now at the start I had imported matplotlib.pyplot as plt, and the next part of the code creates the graph and shows it using rather simple matplotlib calls.  I did look up a lot of stuff on this in the examples and tutorials, but it is not very clear so I struggled a bit to get it to work right.
fig,ax = plt.subplots()
line1, = ax.plot(period,goods,label='Goods')
line2, = ax.plot(period,services,label='Services')
ax.legend(loc='upper left')
ax.set_title('Change in GDP composition over time')
plt.ylabel('% of GDP')
plt.xlabel('Year')
plt.show()
Here's the final product:

What does this mean?

In this graph we see a long-term trend wherein the American economy has been more and more reliant on services instead of goods.  We don't make things any more, or maybe a better way to put it, we make things with such high efficiency and productivity that to feed, house, clothe, provide transport and other utilities, about one third of us can provide for everyone.  So we have to find something for the other two thirds to do.  That turns out to be providing any and all kinds of services to the general population.  This trend has been around for a while and it probably extends to the left back to the turn of the century, although likely it flattens out.  During that time, many have treated it like the weather: something to complain about, but nothing can be done about it.  Even the current politicians are not going to be able to reverse this trend no matter how hard they try.  About the only possible strategy would be to do everything to limit or even reverse the growth in population.  However, this would have other significant adverse consequences, putting the economy into a depression.
One take-away is that if you are starting a new business, concentrate on services rather than goods.  If you have a goods-related business, then consider adding some kind of service provision to complement what you make.  Growth in Services is much more likely, if not easier.

Monday, October 14, 2019

Solving Math Problems (Part 17)

More problems from Advanced Engineering Mathematics, 6th Edition by Kreyszig.

Section 1.4, Problem 14

Solve the following initial value problem:
xyy' = 2y² + 4 x², y(2) = 4
Dividing by x² we get:
(y/x) y' = 2(y/x)² + 4
Changing the variable u = y/x, y' = u + u'x we get:
u(u + u'x) = 2u² + 4
u² + uu'x = 2u² + 4
uu' / (u² + 4) = 1 / x
Integrating both sides we get:
∫ u du / (u² + 4) = ∫ dx / x
We now make another change of variable to z such that z = u² + 4 and dz = 2u du.
½ln |z| = ln |x| + C
Changing back to u from z gives
½ln |u² + 4| = ln |x| + C
Taking e to the power of both sides, squaring, and renaming the constant gives:
u² + 4 = Cx²
Re-arranging to isolate u gives:
u = ±√(Cx² - 4)
And finally changing back to y from u we have:
y = ±x√(Cx² - 4)

Now using the initial condition, 4 = 2√(4C - 4), we find that C = 2 (with the positive root, since y(2) = 4 > 0) and thus the specific function that satisfies the differential equation with this initial condition is:
y = x√(2x² - 4)
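
As a sanity check, the candidate solution y = x√(2x² - 4), which follows from the steps above with C = 2, can be tested numerically against xyy' = 2y² + 4x².  The sketch below uses a simple central finite difference for y'; the function names and the step size are my own choices.

```python
import math

def y(x):
    # Candidate particular solution y = x * sqrt(2x^2 - 4), valid for x > sqrt(2).
    return x * math.sqrt(2 * x * x - 4)

def dy(x, h=1e-6):
    # Central finite-difference approximation of y'.
    return (y(x + h) - y(x - h)) / (2 * h)

def residual(x):
    # Left side minus right side of x*y*y' = 2y^2 + 4x^2.
    return x * y(x) * dy(x) - (2 * y(x) ** 2 + 4 * x * x)

print(y(2.0))  # initial condition: should equal 4
for x in (2.0, 3.0, 5.0):
    print(x, residual(x))  # should be close to zero
```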

Section 1.4, Problem 15

Find the function that satisfies:
Rearranging we get:
yy' - xy' = y + x
Dividing by x gives:
(y/x) y' - y' = (y/x) + 1
Now changing variable with u = (y/x) and y' = u + u'x
u (u + u'x) - (u + u'x) = u + 1
u² + uu'x - u - u'x = u + 1
uu'x - u'x = 1 + 2u - u²
u'x(u - 1) = 1 + 2u - u²
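Separating the variables gives:
(u - 1) du / (1 + 2u - u²) = dx / x
We make another change of variable such that z = 1 + 2u - u² and dz = (2 - 2u) du = -2(u - 1) du and we get:
-½ dz / z = dx / x
Integrating both sides gives:
-½ln|z| = ln|x| + C
z = C / x²
Changing back to u from z:
1 + 2u - u² = C / x²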
Changing back to y from u:
x² + 2xy - y² = C
Then using the initial condition, we find that C = -4


Saturday, June 29, 2019

Solving Physics Problems (Part 2)

Continuing with solving problems from Understanding Relativity by Sartori.

Problem 1.3:

A jetliner has an air speed of 500 mph.  A 200 mph wind is blowing from west to east.
(a) The plane heads due north.  What direction does the plane fly and what is its ground speed?
Answer:  First, let us define the S frame as the one observed from the ground and the S' frame as the one moving with the air.  The magnitude of the plane's velocity with respect to the surrounding air is always 500 mph.  With x pointing east, y pointing north and the plane heading due north, we can say that:
S': x' = 0 ; y' = 500t'
The Galilean transformation from S' to S is:
x = x' + 200t' ; y = y' ; t = t'
And the equations of motion in S are:
S: x = 200t ; y = 500t
From this we can say the direction the plane flies in is arctan(200/500) = 21.8 degrees east of north.  The ground speed is √(200² + 500²) ≈ 538.5 mph.






(b) In what direction should the pilot head in order to fly due north?  What is the plane's ground speed in this case?
Answer:  We are trying to find the heading such that the x component of the velocity in the ground frame is 0.  Using the Galilean transformation, we determine that the x component of velocity in S' has to be -200 mph, but the magnitude still has to remain 500 mph.  So to get the y component v, we need to solve:
(-200)² + v² = 500², so v = √(500² - 200²) ≈ 458.3 mph
The pilot should head arctan(200/458.3) = 23.6 degrees west of north, and the ground speed is 458.3 mph.
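
The numbers for both parts can be reproduced with a few lines of Python.  This is just a sketch of the vector arithmetic, taking x as east and y as north; the variable names are my own.

```python
import math

AIR_SPEED = 500.0  # mph, speed of the plane relative to the air
WIND = 200.0       # mph, wind blowing from west to east (+x)

# (a) Heading due north: add the wind to get the ground velocity.
vx, vy = WIND, AIR_SPEED
ground_speed_a = math.hypot(vx, vy)
drift_a = math.degrees(math.atan2(vx, vy))  # degrees east of north

# (b) Track due north: the x component relative to the air must cancel the wind.
ground_speed_b = math.sqrt(AIR_SPEED**2 - WIND**2)
heading_b = math.degrees(math.atan2(WIND, ground_speed_b))  # degrees west of north

print(round(ground_speed_a, 1), round(drift_a, 1))
print(round(ground_speed_b, 1), round(heading_b, 1))
```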

Problem 1.4

A river is 20 m wide; a 1 m/s current flows downstream. Two swimmers, A and B, arrange a race.  A is to swim to a point 20 m downstream and back while B is to swim straight across and back.  Each can swim 2 m/s in the water.
(a) In what direction should B head in order to swim straight across?  Illustrate with a sketch.
Answer: Relative to the water, the magnitude of B's velocity is 2 m/s.  To swim straight across in the ground frame S, the upstream (x-direction) component of that velocity must cancel the 1 m/s current.  Using the Pythagorean theorem, the component straight across is:
√(2² - 1²) = √3 ≈ 1.73 m/s
Taking the arctan of the ratio of the two components, arctan(1/√3), we find that the heading is 30 degrees (π/6 rad) upstream from the straight-across direction.  Here is the sketch:

(b) Who wins the race and by how much time?
Answer: This question is important since the book takes us through the Michelson-Morley ether detection experiment in the following section.
B's ground speed, going straight across and back, stays at √3 m/s the entire time, so the elapsed time for the 40 m round trip is 23.09 seconds.
A's ground speed changes between the downstream and upstream legs.  Going downstream, she moves at 3 m/s and takes 6 2/3 seconds, but going upstream she moves at 1 m/s and takes 20 seconds, for a total time of 26.7 seconds.  Thus B wins by about 3 1/2 seconds.
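
The race times can be checked with a short calculation.  This is a sketch using the values from the problem statement; the variable names are my own.

```python
import math

WIDTH = 20.0    # m, river width and also A's downstream distance
CURRENT = 1.0   # m/s
SWIM = 2.0      # m/s, swimming speed relative to the water

# B: heads upstream so that the ground velocity points straight across.
speed_b = math.sqrt(SWIM**2 - CURRENT**2)
angle_b = math.degrees(math.asin(CURRENT / SWIM))  # upstream of straight across
time_b = 2 * WIDTH / speed_b

# A: downstream with the current, then back against it.
time_a = WIDTH / (SWIM + CURRENT) + WIDTH / (SWIM - CURRENT)

print(round(angle_b, 1), round(time_b, 2), round(time_a, 2), round(time_a - time_b, 2))
```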


Sunday, June 23, 2019

Solving Physics Problems (Part 1)

So I went to Powell's Books in downtown Portland and found a book by Leo Sartori entitled Understanding Relativity: A Simplified Approach to Einstein's Theories.  This book deals with Einstein's theory of special relativity.  The book has inspired me to try the problems at the end of the chapters, and I will now share my work with you, dear loyal reader.

Problem 1.1

A train moves at a constant speed.  A stone on the train is released from rest.
(a) using the principle of relativity, describe the motion of the stone as seen by observers on the train.
Answer: As seen by observers on the train, the stone, having the same initial velocity as the train, falls straight down.
(b) Using the Galilean transformation, describe the motion of the stone as seen by observers on the ground.  Draw a sketch.
Answer: To the observer on the ground, the train moves to the right with a speed V.  Therefore, the stone also moves to the right with a speed V as it falls.  See the sketch below:

Problem 1.2

Let V = 30 m/s, h₀ = 7.2 m, approximate g as 10 m/s².
(a) Write the equations that describe the stone's motion in frame S'.
Answer:
S': x' = 0 ; y' = h₀ - ½gt'² ; z' = 0
(b) Use the Galilean transformation to write the equations that describe the position of the stone in frame S.  Plot the curve of the motion of the stone in frame S.
Answer:
The transform from S' → S is as follows:
x = x' + Vt'
y = y'
z = z'
t = t'
The equations of motion in S are:
S: x = Vt ; y = h₀ - ½gt² ; z = 0
Here is the plot of the motion (from Desmos)

(c) Write the equations for the three components of the stone's velocity in S' and use the Galilean transformation to find the component in S.
Answer:
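The components of the stone's velocity in S' are:
S': dx'/dt' = 0 ; dy'/dt' = -gt' ; dz'/dt' = 0
The Galilean transformation is:
dx/dt = dx'/dt' + V ; dy/dt = dy'/dt' ; dz/dt = dz'/dt'
And the components of the stone's velocity in S are:
S: dx/dt = V ; dy/dt = -gt ; dz/dt = 0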
(d) Find the magnitude of the stone speed in each frame at t = 1 sec.
Answer:
The magnitudes of the velocity vector in frames S' and S at t = 1 s are:
S': gt = 10 m/s
S: √(V² + (gt)²) = √(30² + 10²) ≈ 31.6 m/s
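
A quick numerical version of part (d), as a sketch: the velocity components are taken from differentiating the equations of motion in part (a) and applying the Galilean transformation from part (b), and the variable names are my own.

```python
import math

V = 30.0   # m/s, train speed
G = 10.0   # m/s^2, approximation used in the problem
T = 1.0    # s

# Frame S' (train): the stone only has a vertical velocity component.
v_train = (0.0, -G * T, 0.0)
# Frame S (ground): the Galilean transformation adds V to the x component.
v_ground = (v_train[0] + V, v_train[1], v_train[2])

speed_train = math.sqrt(sum(c * c for c in v_train))
speed_ground = math.sqrt(sum(c * c for c in v_ground))
print(speed_train, round(speed_ground, 1))
```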


Saturday, June 22, 2019

Solving Math Problems (part 16)

More problems from Advanced Engineering Mathematics, 6th Edition by Kreyszig.

Section 1.4, Problem 11

Solve the initial value problem:
xy' = x + y, y(1) = -7.4
Since we solved the general solution in Problem 1 and found it to be y = x( ln |x| + C), we will now simply apply the initial condition.
 -7.4 = 1 ( ln | 1 | + C )
∴ C = -7.4
The specific curve is described by:
y = x ( ln |x| -7.4 )

Section 1.4, Problem 12

Solve the initial value problem:
xy' = 2x +2y, y(0.5) = 0
From here, we find the general solution is:
y = C x² - 2x
Substituting the given initial value, we get:
0 = C (0.25) - 2 (0.5)
∴ C = 4
The specific curve is described by:
y = 4x² - 2x

Section 1.4, Problem 13

Solve the initial value problem:
yy' = x³ + y²/x, y(2) = 6
Dividing by x gives:
(y/x) y' = x² + y²/x²
Changing the variable by u = y/x and y' = u + u'x, we get:
u ( u + u'x ) = x² + u²
Separating the variables and integrating gives:
∫u du = ∫x dx
½u² = ½x² + C

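Converting to y and renaming the constant 2C as C gives:
y² = x² (x² + C)
Using the given initial value, we have:
36 = 4 (4 + C)
∴ C = 5
The specific solution (taking the positive root, since y(2) = 6 > 0) is
y = x √(x² + 5)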

Friday, June 21, 2019

Solving Math Problems (Part 15)

More problems from Advanced Engineering Mathematics, 6th Edition by Kreyszig.

Section 1.4, Problem 9

Find the general solution of the following differential equation:
Dividing by x we get:
Changing the variable using u = y/x and y' = u+ u'x, we get:
Simplifying, we get:
Integrating both sides gives:
Changing back to y gives:

Section 1.4, Problem 10

Find the general solution of the following differential equation:
xy' - y -x² tan (y/x) = 0
Dividing by x gives:
y' - (y/x) - x tan (y/x) = 0
Making a change of variable such that u = y/x and y' = u+ u'x, we get:
u + u'x - u - x tan u = 0
Simplifying and separating the variables, we have:
u' cot u = 1
Integrating we get:
ln | sin u | = x + C
Isolating the u (and renaming the constant) gives:
sin u = Ce^x, so u = arcsin(Ce^x)
Converting back to y we get:
y = x arcsin(Ce^x)











Wednesday, June 19, 2019

Solving Math Problems (Part 14)

In doing the problems in section 1.4, I had noticed that a few of the problems seemed to fit a larger classification.  See, for example, problems 1 and 2.  In this post, I am going to solve a generalized equation of one sort:
xy' = ax + by
Because there is a point where we divide by b-1, we have to consider two situations.  The simpler is where b = 1.  In this case, we divide everything by x and then use a change of variable where u = y/x and y' = u + u'x.  From this we get:
u + u'x = a + u
u' = a/x
Integrating we get:
u = a ln |x| + C
And converting back to y from u:
y = x (a ln |x| + C )
In the case where b ≠ 1, and using the same change of variable as above, we get:
u + u'x = a + bu
u'x = a + (b-1)u
Integrating both sides, and to be explicit, we will change the variable again by z = a + (b-1)u and dz = (b-1)du to get:
Taking e to the power of both sides, we get:
Taking both sides to the power of (b-1) so as to get z to the first power and then changing back to u, we get:
Rearranged to isolate u, we get:
And converting back to y, we get:
Where c* is defined as:
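
The intermediate equations for the b ≠ 1 case can be written out as below.  This is one consistent way to carry out the steps described above (for x > 0), not necessarily the exact form originally shown: the constant K and the definition c* = K/(b-1) are my own naming assumptions.

```latex
\begin{aligned}
\frac{du}{a + (b-1)u} &= \frac{dx}{x} \\
\frac{1}{b-1}\ln|z| &= \ln|x| + C, \qquad z = a + (b-1)u \\
z &= K x^{\,b-1}, \qquad K = \pm e^{(b-1)C} \\
u &= \frac{K x^{\,b-1} - a}{b-1} \\
y &= ux = c^{*} x^{b} - \frac{a}{b-1}\,x, \qquad c^{*} = \frac{K}{b-1}
\end{aligned}
```

As a check, a = 2 and b = 2 give y = c* x² - 2x, which has the same form as the general solution y = Cx² - 2x of xy' = 2x + 2y.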


Summary

Where the differential equation takes the form of
xy' = ax + by
The solution is