Home · Maps · About

Home > SubChat
 

[ Read Responses | Post a New Response | Return to the Index ]
[ First in Thread | Next in Thread ]

 

view flat

Re: W train returning when Phase I of the 2nd Av subway opens in December 2016?.

Posted by Stephen Bauman on Sun Dec 20 21:36:52 2015, in response to Re: W train returning when Phase I of the 2nd Av subway opens in December 2016?., posted by BusRider on Sun Dec 20 19:32:43 2015.

edf40wrjww2msgDetail:detailStr
fiogf49gjkf0d
I placed the csv files into a database (Postgresql). That's what's needed to analyze the data in a meaningful way.

Here are the SQL statements I used to generate the results.

First, I need to figure out what the id's are of the individual stations.

select distinct * from stops
where sq_seq = 7
order by stop_name, stop_id

This just reads the stops file and sorts it alphabetically by name. N.B. I've been keeping track of a lot of data. Each one has its own sequence index. The latest one is 7.

This gives me the id for the Lex/59th station, which is what I'm interested in. It's R11S for the southbound direction.

Next I want to figure out the headways and arrange them in ascending order. That way I can spot the exceptional values.

Here's the SQL.

select aa.* from
(select a.arrival_time, b.route_id, b.trip_headsign, d.departure_time, d.stop_id, e.stop_name,
case
when d.departure_time > a.arrival_time then a.arrival_time - d.departure_time + '24 hour'
else a.arrival_time - d.departure_time
end as travel_time ,
a.arrival_time - lag(a.arrival_time) over (order by a.arrival_time) as headway


from stop_times a
join trips b on (a.sq_seq = b.sq_seq and a.trip_id = b.trip_id)
join calendar c on (b.sq_seq = c.sq_seq and b.service_id = c.service_id)
join stop_times d on (a.sq_seq = d.sq_seq and a.trip_id = d.trip_id)
join stops e on (d.sq_seq = e.sq_seq and d.stop_id = e.stop_id)

where a.sq_seq = 7 and c.wednesday is true and d.stop_sequence = 1 and a.stop_id = 'R11S' and b.route_id in ('N', 'Q', 'R'))aa


order by aa.headway, aa.arrival_time

This is an nested SQL statement so, it's better to work from the inside out.

I'm scanning the stop_times table for the 'R11S' station. Each entry has a trip id, which I use to link to the specific entry in the trips table. The trips table has the route id and the destination sign. I limit the search only to the N, Q and R routes. The trips table also has an entry to the calendar table via the service id. I further limit the search to weekdays (namely Wednesdays). Next I go back to the stop_times table to pick up where the train originated. That's linked by the trip id. The stop where the train originated has a stop sequence of 1. That allows me to pick up the train's departure time and calculate the travel time. I'd like to know the name of that first station, so I link to the stops table via the first station's stop id.

Finally, I need to calculate the difference in arrival times between trains at Lex/59 to calculate the headway. That's where the lag function comes in. The lag function orders a particular field in a given order and returns the previous row's value. Subtracting that value from the current row's arrival time gives me the headway.

I sort the table by headway and station arrival time to give me hints where to look. I run the same query again and order them only by the arrival time to get the data I presented.

I nested the SQL so that I could use the order statement with the column names I created in the inner SQL.

Responses

Post a New Response

Your Handle:

Your Password:

E-Mail Address:

Subject:

Message:



Before posting.. think twice!


[ Return to the Message Index ]